Abstract. The goal of this lecture is to show how modern theorem
provers (in this case, the Coq proof assistant) can be used to mechanize
the specification of programming languages and their semantics, and to
reason over individual programs and over generic program transformations,
as typically found in compilers. The topics covered include: operational
semantics (small-step, big-step, definitional interpreters); a simple form
of denotational semantics; axiomatic semantics and Hoare logic; generation
of verification conditions, with application to program proof; compilation
to virtual machine code and its proof of correctness; an example of an
optimizing program transformation (dead code elimination) and its proof
of correctness.



Introduction 

The semantics of a programming language describe mathematically the meaning 
of programs written in this language. An example of use of semantics is to define 
a programming language with much greater precision than standard language 
specifications written in English. (See for example the definition of Standard ML 
[38].) In turn, semantics enable us to formally verify some programs, proving that 
they satisfy their specifications. Finally, semantics are also necessary to establish 
the correctness of algorithms and implementations that operate over programs: 
interpreters, compilers, static analyzers (including type-checkers and bytecode 
verifiers), program provers, refactoring tools, etc. 

Semantics for nontrivial programming languages can be quite large and
complex, making traditional, on-paper proofs using these semantics increasingly
painful and unreliable. Automatic theorem provers and especially interactive proof
assistants have great potential to alleviate these problems and scale
semantic-based techniques all the way to realistic programming languages and tools.
Popular proof assistants that have been successfully used in this area include ACL2,
Coq, HOL4, Isabelle/HOL, PVS and Twelf.

The purpose of this lecture is to introduce students to this booming field
of mechanized semantics and its applications to program proof and formal
verification of programming tools such as compilers. Using the prototypical IMP
imperative language as a concrete example, we will:

• mechanize various forms of operational and denotational semantics for this 
language and prove their equivalence (sections 1 and 2); 
Figure 1. The various styles of semantics considered in this lecture and their uses. A double
arrow denotes a semantic equivalence result. A single arrow from A to B means that semantics 
A is used to justify the correctness of B. 



• introduce axiomatic semantics (Hoare logic) and show how to provide
machine assistance for proving IMP programs using a verification condition
generator (section 3);

• define a non-optimizing compiler from IMP to a virtual machine (a small
subset of the Java virtual machine) and prove the correctness of this
compiler via a semantic preservation argument (section 4);

• illustrate optimizing compilation through the development and proof of 
correctness of a dead code elimination pass (section 5). 

We finish with examples of recent achievements and ongoing challenges in this 
area (section 6). 

We use the Coq proof assistant to specify semantics and program
transformations, and to conduct all proofs. The best reference on Coq is Bertot and
Castéran's book [13], but for the purposes of this lecture, Bertot's short tutorial [11]
is largely sufficient. The Coq software and documentation are available as free
software at http://coq.inria.fr/. For lack of time, we will not attempt to teach how to
conduct interactive proofs in Coq (but see the two references above). However,
we hope that by the end of this lecture, students will be familiar enough with
Coq's specification language to be able to read the Coq development underlying
this lecture, and to write Coq specifications for problems of their own interest.

The reference material for this lecture is the Coq development available
at http://gallium.inria.fr/~xleroy/courses/Marktoberdorf-2009/. These
notes explain and recapitulate the definitions and main results using ordinary
mathematical syntax, and provide bibliographical references. To help readers
make the connection with the Coq development, the Coq names for the
definitions and theorems are given as bracketed notes, [like this]. In the PDF version of
the present document, available at the Web site above, these notes are
hyperlinks pointing directly to the corresponding Coq definitions and theorems in the
development.



1. Symbolic expressions 



1.1. Syntax 

As a warm-up exercise, we start by formalizing the syntax and semantics of a
simple language of expressions comprising variables (x, y, ...), integer constants
n, and two operators + and -.

Expressions: [expr]

\[ e ::= x \mid n \mid e_1 + e_2 \mid e_1 - e_2 \]

The Coq representation of expressions is as an inductive type, similar to an
ML or Haskell datatype.

Definition ident := nat.

Inductive expr : Type :=
  | Evar: ident -> expr
  | Econst: Z -> expr
  | Eadd: expr -> expr -> expr
  | Esub: expr -> expr -> expr.

nat and Z are predefined types for natural numbers and integers, respectively.
Each case of the inductive type is a function that constructs terms of type expr.
For instance, Evar applied to the name of a variable produces the representation
of the corresponding expression; and Eadd applied to the representations of two
subexpressions $e_1$ and $e_2$ returns the representation of the expression $e_1 + e_2$.
Moreover, all terms of type expr are finitely generated by repeated applications
of the 4 constructor functions; this enables definitions by pattern matching and
reasoning by case analysis and induction.

1.2. Denotational semantics 

The simplest and perhaps most natural way to specify the semantics of this
language is as a function $\llbracket e \rrbracket\, s$ that associates an integer value to the
expression e in the state s. States associate values to variables.

\[
\begin{array}{ll}
\llbracket x \rrbracket\, s = s(x) &
\llbracket n \rrbracket\, s = n \\
\llbracket e_1 + e_2 \rrbracket\, s = \llbracket e_1 \rrbracket\, s + \llbracket e_2 \rrbracket\, s &
\llbracket e_1 - e_2 \rrbracket\, s = \llbracket e_1 \rrbracket\, s - \llbracket e_2 \rrbracket\, s
\end{array}
\]

In Coq, this denotational semantics is presented as a recursive function
[eval_expr].

Definition state := ident -> Z.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.


Fixpoint marks a recursive function definition. The struct e annotation states
that it is structurally recursive on its e parameter, and therefore guaranteed
to terminate. The match ... with construct represents pattern matching on the
shape of the expression e.

1.3. Using the denotational semantics 

The eval_expr function can be used as an interpreter, to evaluate expressions in 
a known environment. For example: 

Eval compute in (
  let x : ident := O in
  let s : state := fun y => if eq_ident y x then 12 else 0 in
  eval_expr s (Eadd (Evar x) (Econst 1))).

Coq prints "13 : Z". For additional performance, efficient executable Caml code 
can also be generated automatically from the Coq definition of eval_expr using 
the extraction mechanism of Coq. 

Another use of eval_expr is to reason symbolically over expressions in
arbitrary states. Consider the following claim:

Remark expr_add_pos:
  forall s x,
  s x >= 0 -> eval_expr s (Eadd (Evar x) (Econst 1)) > 0.

Using the simpl tactic of Coq, the goal reduces to a purely arithmetic statement:

  forall s x, s x >= 0 -> s x + 1 > 0.

which can be proved by standard arithmetic (the omega tactic).
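The claim above can be replayed as a complete, self-contained script. This is a minimal sketch rather than the development's own file: it repeats the definitions from this section and uses the lia tactic in place of omega, on the assumption that a recent Coq release is used (omega is no longer shipped with recent versions).

```coq
Require Import ZArith Lia.
Open Scope Z_scope.

Definition ident := nat.

Inductive expr : Type :=
  | Evar : ident -> expr
  | Econst : Z -> expr
  | Eadd : expr -> expr -> expr
  | Esub : expr -> expr -> expr.

Definition state := ident -> Z.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.

Remark expr_add_pos:
  forall s x, s x >= 0 -> eval_expr s (Eadd (Evar x) (Econst 1)) > 0.
Proof.
  intros s x H.
  simpl.   (* the goal is now the arithmetic statement  s x + 1 > 0 *)
  lia.     (* linear integer arithmetic; stands in for omega *)
Qed.
```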

Finally, the denotation function eval_expr can also be used to prove "meta" 
properties of the semantics. For example, we can easily show that the denotation of 
an expression is insensitive to values of variables not mentioned in the expression. 

Lemma eval_expr_domain:
  forall s1 s2 e,
  (forall x, is_free x e -> s1 x = s2 x) ->
  eval_expr s1 e = eval_expr s2 e.

The proof is a simple induction on the structure of e. The predicate is_free, 
stating whether a variable occurs in an expression, is itself defined as a recursive 
function: 

Fixpoint is_free (x: ident) (e: expr) {struct e} : Prop :=
  match e with
  | Evar y => x = y
  | Econst n => False
  | Eadd e1 e2 => is_free x e1 \/ is_free x e2
  | Esub e1 e2 => is_free x e1 \/ is_free x e2
  end.

As the Prop annotation indicates, the result of this function is not a data type 
but a logical formula. 
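To make the "simple induction on the structure of e" concrete, here is one possible proof script. It is a sketch that restates the definitions so that it can be checked on its own; the tactic script is ours and is not necessarily the one used in the accompanying development.

```coq
Require Import ZArith.
Open Scope Z_scope.

Definition ident := nat.

Inductive expr : Type :=
  | Evar : ident -> expr
  | Econst : Z -> expr
  | Eadd : expr -> expr -> expr
  | Esub : expr -> expr -> expr.

Definition state := ident -> Z.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.

Fixpoint is_free (x: ident) (e: expr) {struct e} : Prop :=
  match e with
  | Evar y => x = y
  | Econst n => False
  | Eadd e1 e2 => is_free x e1 \/ is_free x e2
  | Esub e1 e2 => is_free x e1 \/ is_free x e2
  end.

Lemma eval_expr_domain:
  forall s1 s2 e,
  (forall x, is_free x e -> s1 x = s2 x) ->
  eval_expr s1 e = eval_expr s2 e.
Proof.
  induction e; simpl; intros H.
  - (* variable: it is free in itself *)
    apply H. reflexivity.
  - (* constant: both sides are the same integer *)
    reflexivity.
  - (* addition: rewrite with both induction hypotheses; auto discharges
       the side conditions using H and the disjunction in is_free *)
    rewrite IHe1, IHe2; auto.
  - (* subtraction: same argument *)
    rewrite IHe1, IHe2; auto.
Qed.
```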



1.4. Variants



The denotational semantics we gave above interprets the + and - operators as
arithmetic over mathematical integers. We can easily interpret them differently,
for instance as signed, modulo $2^{32}$ arithmetic (as in Java):

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => normalize n
  | Eadd e1 e2 => normalize (eval_expr s e1 + eval_expr s e2)
  | Esub e1 e2 => normalize (eval_expr s e1 - eval_expr s e2)
  end.

Here, normalize n is n reduced modulo $2^{32}$ to the interval $[-2^{31}, 2^{31})$.
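The notes do not spell out normalize. One possible definition, given here as an assumption rather than as the development's own, shifts the argument by $2^{31}$, reduces it modulo $2^{32}$, and shifts back. The two examples check a value inside the interval and the wrap-around at the upper bound.

```coq
Require Import ZArith.
Open Scope Z_scope.

(* Hypothetical definition: shift into [0, 2^32), reduce, shift back. *)
Definition normalize (n: Z) : Z := (n + 2^31) mod 2^32 - 2^31.

(* Values already in [-2^31, 2^31) are left unchanged. *)
Example normalize_small: normalize 5 = 5.
Proof. reflexivity. Qed.

(* 2^31 is just outside the interval and wraps around to -2^31. *)
Example normalize_wraps: normalize (2^31) = - 2^31.
Proof. reflexivity. Qed.
```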

We can also account for undefined expressions. In practical programming 
languages, the value of an expression can be undefined for several reasons: if it 
mentions a variable that was not previously defined; in case of overflow during 
an arithmetic operation; in case of an integer division by 0; etc. A simple way to 
account for undefinedness is to use the option type, as defined in Coq's standard 
library. This is a two-constructor inductive type with None meaning "undefined" 
and Some n meaning "defined and having value n" . 

Definition state := ident -> option Z.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : option Z :=
  match e with
  | Evar x => s x
  | Econst n => Some n
  | Eadd e1 e2 =>
      match eval_expr s e1, eval_expr s e2 with
      | Some n1, Some n2 => Some (n1 + n2)
      | _, _ => None
      end
  | Esub e1 e2 =>
      match eval_expr s e1, eval_expr s e2 with
      | Some n1, Some n2 => Some (n1 - n2)
      | _, _ => None
      end
  end.

1.5. Summary

The approach we followed in this section (denotational semantics represented as
a Coq recursive function) is natural and convenient, but limited by a
fundamental aspect of Coq: all functions must be terminating, so that they are defined
everywhere by construction. The termination guarantee can come either by the
fact that they are structurally recursive (recursive calls are only done on strict
sub-terms of the argument, as in the case of eval_expr), or by Noetherian
recursion on a well-founded ordering. Consequently, the approach followed in this
section cannot be used to give semantics to languages featuring general loops or
general recursion. As we now illustrate with the IMP language, we need to move
away from functional presentations of the semantics (where a function computes a
result given a state and a term) and adopt relational presentations instead (where
a ternary predicate relates a state, a term, and a result).



2. The IMP language and its semantics 



2.1. Syntax 



The IMP language is a very simple imperative language with structured control. 
Syntactically, it extends the language of expressions from section 1 with boolean 
expressions (conditions) and commands (statements): 

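Expressions: [expr]

\[ e ::= x \mid n \mid e_1 + e_2 \mid e_1 - e_2 \]

Boolean expressions: [bool_expr]

\[ b ::= e_1 = e_2 \mid e_1 < e_2 \]

Commands: [cmd]

\[ c ::= \mathtt{skip} \mid x := e \mid c_1; c_2 \mid \mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2 \mid \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done} \]

The semantics of boolean expressions is given in the denotational style of section 1,
as a function from states to booleans [eval_bool_expr].

\[
\llbracket e_1 = e_2 \rrbracket\, s =
\begin{cases}
\mathtt{true} & \text{if } \llbracket e_1 \rrbracket\, s = \llbracket e_2 \rrbracket\, s; \\
\mathtt{false} & \text{otherwise.}
\end{cases}
\qquad
\llbracket e_1 < e_2 \rrbracket\, s =
\begin{cases}
\mathtt{true} & \text{if } \llbracket e_1 \rrbracket\, s < \llbracket e_2 \rrbracket\, s; \\
\mathtt{false} & \text{otherwise.}
\end{cases}
\]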

2.2. Reduction semantics 



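A standard way to give semantics to languages such as IMP, where programs may
not terminate, is reduction semantics, popularized by Plotkin under the name
"structural operational semantics" [49], and also called "small-step semantics".
It builds on a reduction relation $(c, s) \rightarrow (c', s')$, meaning: in initial state s, the
command c performs one elementary step of computation, resulting in modified
state s' and residual computations c'. [red]

\[ (x := e,\ s) \rightarrow (\mathtt{skip},\ s[x \leftarrow \llbracket e \rrbracket\, s]) \quad \text{(red\_assign)} \]

\[ \frac{(c_1, s) \rightarrow (c_1', s')}{((c_1; c_2),\ s) \rightarrow ((c_1'; c_2),\ s')} \quad \text{(red\_seq\_left)} \]

\[ ((\mathtt{skip}; c),\ s) \rightarrow (c, s) \quad \text{(red\_seq\_skip)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{true}}{((\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2),\ s) \rightarrow (c_1, s)} \quad \text{(red\_if\_true)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{false}}{((\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2),\ s) \rightarrow (c_2, s)} \quad \text{(red\_if\_false)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{true}}{((\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}),\ s) \rightarrow ((c;\ \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}),\ s)} \quad \text{(red\_while\_true)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{false}}{((\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}),\ s) \rightarrow (\mathtt{skip}, s)} \quad \text{(red\_while\_false)} \]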



The Coq translation of such a definition by inference rules is called an
inductive predicate. Such predicates build on the same inductive definition mechanisms
that we already use to represent abstract syntax trees, but the resulting logical
object is a proposition (sort Prop) instead of a data type (sort Type).

The general recipe for translating inference rules to an inductive predicate
is as follows. First, write each axiom and rule as a proper logical formula, using
implications and universal quantification over free variables. For example, the rule
red_seq_left becomes

forall c1 c2 s c1' s',
  red (c1, s) (c1', s') ->
  red (Cseq c1 c2, s) (Cseq c1' c2, s')

Second, give a name to each rule. (These names are called "constructors", by
analogy with data type constructors.) Last, wrap these named rules in an inductive
predicate definition like the following.

Inductive red: (cmd * state) -> (cmd * state) -> Prop :=
  | red_assign: forall x e s,
      red (Cassign x e, s) (Cskip, update s x (eval_expr s e))
  | red_seq_left: forall c1 c2 s c1' s',
      red (c1, s) (c1', s') ->
      red (Cseq c1 c2, s) (Cseq c1' c2, s')
  | red_seq_skip: forall c s,
      red (Cseq Cskip c, s) (c, s)
  | red_if_true: forall s b c1 c2,
      eval_bool_expr s b = true ->
      red (Cifthenelse b c1 c2, s) (c1, s)
  | red_if_false: forall s b c1 c2,
      eval_bool_expr s b = false ->
      red (Cifthenelse b c1 c2, s) (c2, s)
  | red_while_true: forall s b c,
      eval_bool_expr s b = true ->
      red (Cwhile b c, s) (Cseq c (Cwhile b c), s)
  | red_while_false: forall b c s,
      eval_bool_expr s b = false ->
      red (Cwhile b c, s) (Cskip, s).
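To see the constructors used as theorems, the following sketch assembles the IMP syntax and the reduction relation into one self-contained file and derives a single reduction step. The constructor names Bequal and Bless, and the definition of update via Nat.eqb, are assumptions made for this illustration; the accompanying development may define them differently.

```coq
Require Import ZArith.
Open Scope Z_scope.

Definition ident := nat.
Definition state := ident -> Z.

Inductive expr : Type :=
  | Evar : ident -> expr
  | Econst : Z -> expr
  | Eadd : expr -> expr -> expr
  | Esub : expr -> expr -> expr.

Inductive bool_expr : Type :=          (* assumed constructor names *)
  | Bequal : expr -> expr -> bool_expr
  | Bless : expr -> expr -> bool_expr.

Inductive cmd : Type :=
  | Cskip : cmd
  | Cassign : ident -> expr -> cmd
  | Cseq : cmd -> cmd -> cmd
  | Cifthenelse : bool_expr -> cmd -> cmd -> cmd
  | Cwhile : bool_expr -> cmd -> cmd.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.

Definition eval_bool_expr (s: state) (b: bool_expr) : bool :=
  match b with
  | Bequal e1 e2 => Z.eqb (eval_expr s e1) (eval_expr s e2)
  | Bless e1 e2 => Z.ltb (eval_expr s e1) (eval_expr s e2)
  end.

(* Assumed definition of state update. *)
Definition update (s: state) (x: ident) (v: Z) : state :=
  fun y => if Nat.eqb y x then v else s y.

Inductive red: (cmd * state) -> (cmd * state) -> Prop :=
  | red_assign: forall x e s,
      red (Cassign x e, s) (Cskip, update s x (eval_expr s e))
  | red_seq_left: forall c1 c2 s c1' s',
      red (c1, s) (c1', s') ->
      red (Cseq c1 c2, s) (Cseq c1' c2, s')
  | red_seq_skip: forall c s,
      red (Cseq Cskip c, s) (c, s)
  | red_if_true: forall s b c1 c2,
      eval_bool_expr s b = true ->
      red (Cifthenelse b c1 c2, s) (c1, s)
  | red_if_false: forall s b c1 c2,
      eval_bool_expr s b = false ->
      red (Cifthenelse b c1 c2, s) (c2, s)
  | red_while_true: forall s b c,
      eval_bool_expr s b = true ->
      red (Cwhile b c, s) (Cseq c (Cwhile b c), s)
  | red_while_false: forall b c s,
      eval_bool_expr s b = false ->
      red (Cwhile b c, s) (Cskip, s).

(* One step of (x := 1; skip): rule red_seq_left, then red_assign. *)
Example red_one_step: forall s,
  red (Cseq (Cassign O (Econst 1)) Cskip, s)
      (Cseq Cskip Cskip, update s O (eval_expr s (Econst 1))).
Proof.
  intros s. apply red_seq_left. apply red_assign.
Qed.
```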



Each constructor of the definition is a theorem that lets us conclude
red (c, s) (c', s') when the corresponding premises hold. Moreover, the
proposition red (c, s) (c', s') holds only if it was derived by applying these theorems
a finite number of times (smallest fixpoint). This provides us with powerful
reasoning principles: by case analysis on the last rule used, and by induction on
a derivation. Consider for example the determinism of the reduction relation:

Lemma red_deterministic:
  forall cs cs1, red cs cs1 -> forall cs2, red cs cs2 -> cs1 = cs2.

It is easily proved by induction on a derivation of red cs cs1 and a case analysis
on the last rule used to conclude red cs cs2.

From the one-step reduction relation, we define the behavior of a
command c in an initial state s by forming sequences of reductions
starting at (c, s):

• Termination with final state s', written $(c, s) \Downarrow s'$: finite sequence of
reductions to skip. [terminates]

\[ (c, s) \xrightarrow{*} (\mathtt{skip}, s') \]

• Divergence, written $(c, s) \Uparrow$: infinite sequence of reductions. [diverges]

\[ \forall c', \forall s',\ (c, s) \xrightarrow{*} (c', s') \implies \exists c'', \exists s'',\ (c', s') \rightarrow (c'', s'') \]

• Going wrong, written $(c, s) \Downarrow \mathtt{wrong}$: finite sequence of reductions to an
irreducible state that is not skip. [goes_wrong]

\[ (c, s) \rightarrow \cdots \rightarrow (c', s') \not\rightarrow \quad \text{with } c' \neq \mathtt{skip} \]
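The three behaviors only depend on the one-step relation, so they can be sketched over an abstract transition system. In the block below, config, step and final are hypothetical section variables standing for cmd * state, red, and "the command is skip"; the accompanying development states these notions directly on IMP configurations. The lemma star_trans is the transitivity property of reduction sequences.

```coq
Section BEHAVIORS.

Variable config: Type.                     (* e.g. cmd * state *)
Variable step: config -> config -> Prop.   (* e.g. red *)
Variable final: config -> Prop.            (* e.g. "the command is skip" *)

(* Zero, one or several steps: reflexive transitive closure of step. *)
Inductive star: config -> config -> Prop :=
  | star_refl: forall a, star a a
  | star_step: forall a b c, step a b -> star b c -> star a c.

Lemma star_trans:
  forall a b, star a b -> forall c, star b c -> star a c.
Proof.
  induction 1; intros.
  - assumption.
  - eapply star_step; eauto.
Qed.

(* No step is possible from a. *)
Definition irred (a: config) : Prop := forall b, ~ step a b.

(* Termination: finitely many steps to a final configuration. *)
Definition terminates (a b: config) : Prop := star a b /\ final b.

(* Divergence: every reachable configuration can take one more step. *)
Definition diverges (a: config) : Prop :=
  forall b, star a b -> exists c, step b c.

(* Going wrong: reaching a stuck configuration that is not final. *)
Definition goes_wrong (a: config) : Prop :=
  exists b, star a b /\ irred b /\ ~ final b.

End BEHAVIORS.
```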

2.3. Natural semantics 

An alternative to structural operational semantics is Kahn's natural semantics
[26], also called big-step semantics. Instead of describing terminating executions as
sequences of reductions, natural semantics aims at giving a direct axiomatization
of executions using inference rules.

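To build intuitions for natural semantics, consider a terminating reduction
sequence for the command c; c'.

\[ ((c; c'),\ s) \rightarrow ((c_1; c'),\ s_1) \rightarrow \cdots \rightarrow ((\mathtt{skip}; c'),\ s_2) \rightarrow (c', s_2) \rightarrow \cdots \rightarrow (\mathtt{skip}, s_3) \]

It contains a terminating reduction sequence for c, of the form $(c, s) \xrightarrow{*} (\mathtt{skip}, s_2)$,
followed by another terminating sequence for $(c', s_2)$.

The idea of natural semantics is to write inference rules that follow this
structure and define a predicate $c, s \Rightarrow s'$, meaning "in initial state s, the command c
terminates with final state s'". [exec]

\[ \mathtt{skip},\ s \Rightarrow s \quad \text{(exec\_skip)} \qquad\qquad x := e,\ s \Rightarrow s[x \leftarrow \llbracket e \rrbracket\, s] \quad \text{(exec\_assign)} \]

\[ \frac{c_1, s \Rightarrow s_1 \qquad c_2, s_1 \Rightarrow s_2}{c_1; c_2,\ s \Rightarrow s_2} \quad \text{(exec\_seq)} \]

\[ \frac{\begin{array}{l} c_1, s \Rightarrow s' \ \text{ if } \llbracket b \rrbracket\, s = \mathtt{true} \\ c_2, s \Rightarrow s' \ \text{ if } \llbracket b \rrbracket\, s = \mathtt{false} \end{array}}{(\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2),\ s \Rightarrow s'} \quad \text{(exec\_if)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{false}}{\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s \Rightarrow s} \quad \text{(exec\_while\_stop)} \]

\[ \frac{\llbracket b \rrbracket\, s = \mathtt{true} \qquad c, s \Rightarrow s_1 \qquad \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s_1 \Rightarrow s_2}{\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s \Rightarrow s_2} \quad \text{(exec\_while\_loop)} \]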



We now have two different semantics for the same language. A legitimate 
question to ask is whether they are equivalent: do both semantics predict the 
same "terminates / diverges / goes wrong" behaviors for any given program? 
Such an equivalence result strengthens the confidence we have in both semantics. 
Moreover, it enables us to use whichever semantics is more convenient to prove 
a property of interest. We first show an implication from natural semantics to 
terminating reduction sequences. 

Theorem 1 [exec_terminates] If $c, s \Rightarrow s'$, then $(c, s) \xrightarrow{*} (\mathtt{skip}, s')$.

The proof is a straightforward induction on a derivation of $c, s \Rightarrow s'$ and
case analysis on the last rule used. Here is a representative case: $c = c_1; c_2$. By
hypothesis, $c_1; c_2,\ s \Rightarrow s'$. By inversion, we know that $c_1, s \Rightarrow s_1$ and
$c_2, s_1 \Rightarrow s'$ for some intermediate state $s_1$. Applying the induction hypothesis
twice, we obtain $(c_1, s) \xrightarrow{*} (\mathtt{skip}, s_1)$ and $(c_2, s_1) \xrightarrow{*} (\mathtt{skip}, s')$.
A context lemma (proved separately by induction) shows that
$((c_1; c_2), s) \xrightarrow{*} ((\mathtt{skip}; c_2), s_1)$. To obtain the expected
result, all we need to do is to assemble the reduction sequences together, using
the transitivity of $\xrightarrow{*}$:

\[ ((c_1; c_2),\ s) \xrightarrow{*} ((\mathtt{skip}; c_2),\ s_1) \rightarrow (c_2, s_1) \xrightarrow{*} (\mathtt{skip}, s') \]

The converse implication (from terminating reduction sequences to natural
semantics) is more difficult. The idea is to consider mixed executions that start
with some reduction steps and finish in one big step using the natural semantics:

\[ (c_1, s_1) \rightarrow \cdots \rightarrow (c_i, s_i) \Rightarrow s' \]

We first show that the last reduction step can always be "absorbed" by the final
big step:

Lemma 2 [red_preserves_exec] If $(c, s) \rightarrow (c', s')$ and $c', s' \Rightarrow s''$, then $c, s \Rightarrow s''$.

Combining this lemma with an induction on the sequence of reductions, we
obtain the desired semantic implication:

Theorem 3 [terminates_exec] If $(c, s) \xrightarrow{*} (\mathtt{skip}, s')$, then $c, s \Rightarrow s'$.

2.4. Natural semantics for divergence

Kahn-style natural semantics correctly characterize programs that terminate,
either normally (as in section 2.3) or by going wrong (through the addition of
so-called error rules). For a long time it was believed that natural semantics is
unable to account for divergence. As observed by Grall and Leroy [32], this is not
true: diverging executions can also be described in the style of natural semantics,
provided a coinductive definition (greatest fixpoint) is used. Define the infinite
execution relation $c, s \Rightarrow \infty$ (from initial state s, the command c diverges). [execinf]

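\[
\begin{array}{c} c_1, s \Rightarrow \infty \\ \hline\hline c_1; c_2,\ s \Rightarrow \infty \end{array}
\quad \text{(execinf\_seq\_left)}
\qquad
\begin{array}{c} c_1, s \Rightarrow s_1 \qquad c_2, s_1 \Rightarrow \infty \\ \hline\hline c_1; c_2,\ s \Rightarrow \infty \end{array}
\quad \text{(execinf\_seq\_right)}
\]

\[
\begin{array}{c}
c_1, s \Rightarrow \infty \ \text{ if } \llbracket b \rrbracket\, s = \mathtt{true} \\
c_2, s \Rightarrow \infty \ \text{ if } \llbracket b \rrbracket\, s = \mathtt{false} \\
\hline\hline
\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2,\ s \Rightarrow \infty
\end{array}
\quad \text{(execinf\_if)}
\]

\[
\begin{array}{c} \llbracket b \rrbracket\, s = \mathtt{true} \qquad c, s \Rightarrow \infty \\ \hline\hline \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s \Rightarrow \infty \end{array}
\quad \text{(execinf\_while\_body)}
\]

\[
\begin{array}{c} \llbracket b \rrbracket\, s = \mathtt{true} \qquad c, s \Rightarrow s_1 \qquad \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s_1 \Rightarrow \infty \\ \hline\hline \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s \Rightarrow \infty \end{array}
\quad \text{(execinf\_while\_loop)}
\]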

As denoted by the double horizontal bars, these rules must be interpreted 
coinductively as a greatest fixpoint [32, section 2]. Equivalently, the coinductive 
interpretation corresponds to conclusions of possibly infinite derivation trees, while 
the inductive interpretation corresponds to finite derivation trees. Coq provides 
built-in support for coinductive definitions of data types and predicates. 
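As a small illustration of this built-in support, the sketch below defines a coinductive predicate infseq ("there is an infinite sequence of R-steps starting here") over an abstract relation, and proves by cofix that counting upwards on natural numbers never stops. The names infseq, succ_step and counts_forever are ours, chosen for this example; the predicate execinf of the notes follows the same pattern with the rules above as constructors.

```coq
Section INFSEQ.

Variable A: Type.
Variable R: A -> A -> Prop.

(* Coinductive reading: derivations may be infinitely deep. *)
CoInductive infseq: A -> Prop :=
  | infseq_step: forall a b, R a b -> infseq b -> infseq a.

End INFSEQ.

(* A toy transition relation: n steps to n + 1. *)
Definition succ_step (n m: nat) : Prop := m = S n.

Lemma counts_forever: forall n, infseq nat succ_step n.
Proof.
  cofix CIH. intros n.
  (* Take one step from n to S n, then appeal to the coinduction
     hypothesis; the call to CIH sits under the constructor
     infseq_step, which is what the guardedness check requires. *)
  apply infseq_step with (b := S n).
  - unfold succ_step. reflexivity.
  - apply CIH.
Qed.
```

Under an inductive reading of the same rule (Inductive instead of CoInductive), no such proof exists, since the predicate would have no base case.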

As in section 2.3 and perhaps even more so here, we need to prove an
equivalence between the $c, s \Rightarrow \infty$ predicate and the existence of infinite reduction
sequences. One implication follows from the decomposition lemma below:

Lemma 4 [execinf_red_step] If $c, s \Rightarrow \infty$, there exist c' and s' such that
$(c, s) \rightarrow (c', s')$ and $c', s' \Rightarrow \infty$.

A simple argument by coinduction, detailed in [32], then concludes the
expected implication:

Theorem 5 [execinf_diverges] If $c, s \Rightarrow \infty$, then $(c, s) \Uparrow$.

The reverse implication uses two inversion lemmas:

• If $((c_1; c_2), s) \Uparrow$, either $(c_1, s) \Uparrow$ or there exists s' such that
$(c_1, s) \xrightarrow{*} (\mathtt{skip}, s')$ and $(c_2, s') \Uparrow$.

• If $(\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s) \Uparrow$, then $\llbracket b \rrbracket\, s = \mathtt{true}$ and either $(c, s) \Uparrow$ or there
exists s' such that $(c, s) \xrightarrow{*} (\mathtt{skip}, s')$ and $(\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done},\ s') \Uparrow$.

These lemmas follow from determinism of the $\rightarrow$ relation and the seemingly
obvious fact that any reduction sequence is either infinite or stops, after finitely
many reductions, on an irreducible configuration:

\[ \forall c, s,\quad (c, s) \Uparrow \ \lor\ \exists c', \exists s',\ (c, s) \xrightarrow{*} (c', s') \land (c', s') \not\rightarrow \]

The property above cannot be proved in Coq's constructive logic: such a
constructive proof would be, in essence, a program that decides the halting problem.
However, we can add the law of excluded middle ($\forall P,\ P \lor \neg P$) to Coq as an
axiom, without breaking logical consistency. The fact above can easily be proved
from the law of excluded middle.

Theorem 6 [diverges_execinf] If $(c, s) \Uparrow$, then $c, s \Rightarrow \infty$.

2.5. Definitional interpreter

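As mentioned at the end of section 1, we cannot write a Coq function with type
cmd -> state -> state that would execute a command and return its final state
whenever the command terminates: this function would not be total. We can,
however, define a Coq function $I(n, c, s)$ that executes c in initial state s, taking
as extra argument a natural number n used to bound the amount of computation
performed. This function returns either $\lfloor s' \rfloor$ (termination with state s') or $\bot$
(insufficient recursion depth). [interp]

\[
\begin{array}{lcll}
I(0, c, s) & = & \bot \\
I(n+1, \mathtt{skip}, s) & = & \lfloor s \rfloor \\
I(n+1, x := e, s) & = & \lfloor s[x \leftarrow \llbracket e \rrbracket\, s] \rfloor \\
I(n+1, (c_1; c_2), s) & = & I(n, c_1, s) \rhd (\lambda s'.\ I(n, c_2, s')) \\
I(n+1, (\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2), s) & = & I(n, c_1, s) & \text{if } \llbracket b \rrbracket\, s = \mathtt{true} \\
I(n+1, (\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2), s) & = & I(n, c_2, s) & \text{if } \llbracket b \rrbracket\, s = \mathtt{false} \\
I(n+1, (\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}), s) & = & \lfloor s \rfloor & \text{if } \llbracket b \rrbracket\, s = \mathtt{false} \\
I(n+1, (\mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}), s) & = & I(n, c, s) \rhd (\lambda s'.\ I(n, \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}, s')) & \text{if } \llbracket b \rrbracket\, s = \mathtt{true}
\end{array}
\]

The "bind" operator $\rhd$, reminiscent of monads in functional programming, is
defined by $\bot \rhd f = \bot$ and $\lfloor s \rfloor \rhd f = f(s)$.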
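The equations above transcribe almost literally into a Coq fixpoint that is structurally recursive on the bound n. The block below is a self-contained sketch: the names result, Bot, Res and bind, the constructors Bequal and Bless, and the definition of update are assumptions made for this illustration and need not match the accompanying development. The final example runs a countdown loop with a bound of 20, which is more than this program needs.

```coq
Require Import ZArith.
Open Scope Z_scope.

Definition ident := nat.
Definition state := ident -> Z.

Inductive expr : Type :=
  | Evar : ident -> expr
  | Econst : Z -> expr
  | Eadd : expr -> expr -> expr
  | Esub : expr -> expr -> expr.

Inductive bool_expr : Type :=
  | Bequal : expr -> expr -> bool_expr
  | Bless : expr -> expr -> bool_expr.

Inductive cmd : Type :=
  | Cskip : cmd
  | Cassign : ident -> expr -> cmd
  | Cseq : cmd -> cmd -> cmd
  | Cifthenelse : bool_expr -> cmd -> cmd -> cmd
  | Cwhile : bool_expr -> cmd -> cmd.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.

Definition eval_bool_expr (s: state) (b: bool_expr) : bool :=
  match b with
  | Bequal e1 e2 => Z.eqb (eval_expr s e1) (eval_expr s e2)
  | Bless e1 e2 => Z.ltb (eval_expr s e1) (eval_expr s e2)
  end.

Definition update (s: state) (x: ident) (v: Z) : state :=
  fun y => if Nat.eqb y x then v else s y.

(* Bot: insufficient recursion depth.  Res s: termination in state s. *)
Inductive result : Type :=
  | Bot : result
  | Res : state -> result.

Definition bind (r: result) (f: state -> result) : result :=
  match r with Bot => Bot | Res s => f s end.

Fixpoint interp (n: nat) (c: cmd) (s: state) {struct n} : result :=
  match n with
  | O => Bot
  | S n' =>
      match c with
      | Cskip => Res s
      | Cassign x e => Res (update s x (eval_expr s e))
      | Cseq c1 c2 => bind (interp n' c1 s) (interp n' c2)
      | Cifthenelse b c1 c2 =>
          if eval_bool_expr s b then interp n' c1 s else interp n' c2 s
      | Cwhile b c1 =>
          if eval_bool_expr s b
          then bind (interp n' c1 s) (interp n' (Cwhile b c1))
          else Res s
      end
  end.

(* x := 3; while 0 < x do x := x - 1 done *)
Definition x : ident := O.
Definition countdown : cmd :=
  Cseq (Cassign x (Econst 3))
       (Cwhile (Bless (Econst 0) (Evar x))
               (Cassign x (Esub (Evar x) (Econst 1)))).

(* With enough fuel the interpreter terminates and x ends at 0. *)
Example interp_countdown:
  match interp 20%nat countdown (fun _ => 0) with
  | Res s => s x = 0
  | Bot => False
  end.
Proof. compute. reflexivity. Qed.
```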

A crucial property of this definitional interpreter is that it is monotone with
respect to the maximal recursion depth n. Evaluation results are ordered by taking
$\bot \le \lfloor s \rfloor$ [res_le].

Lemma 7 [interp_mon] (Monotonicity of I.) If $n \le m$, then $I(n, c, s) \le I(m, c, s)$.

Exploiting this property, we can show partial correctness results of the
definitional interpreter with respect to the natural semantics:

Lemma 8 [interp_exec] If $I(n, c, s) = \lfloor s' \rfloor$, then $c, s \Rightarrow s'$.

Lemma 9 [exec_interp] If $c, s \Rightarrow s'$, there exists an n such that $I(n, c, s) = \lfloor s' \rfloor$.

Lemma 10 [execinf_interp] If $c, s \Rightarrow \infty$, then $I(n, c, s) = \bot$ for all n.



2.6. Denotational semantics 



A simple form of denotational semantics [41] can be obtained by "letting n go
to infinity" in the definitional interpreter.

Lemma 11 [interp_limit_dep] For every c, there exists a function $\llbracket c \rrbracket$ from states to
evaluation results such that $\forall s, \exists m, \forall n \ge m,\ I(n, c, s) = \llbracket c \rrbracket\, s$.

Again, this result cannot be proved in Coq's constructive logic and requires
the axiom of excluded middle and an axiom of description.

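This denotation function $\llbracket c \rrbracket$ satisfies the equations of denotational semantics:

\[
\begin{array}{lcll}
\llbracket \mathtt{skip} \rrbracket\, s & = & \lfloor s \rfloor \\
\llbracket x := e \rrbracket\, s & = & \lfloor s[x \leftarrow \llbracket e \rrbracket\, s] \rfloor \\
\llbracket c_1; c_2 \rrbracket\, s & = & \llbracket c_1 \rrbracket\, s \rhd (\lambda s'.\ \llbracket c_2 \rrbracket\, s') \\
\llbracket \mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2 \rrbracket\, s & = & \llbracket c_1 \rrbracket\, s & \text{if } \llbracket b \rrbracket\, s = \mathtt{true} \\
\llbracket \mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2 \rrbracket\, s & = & \llbracket c_2 \rrbracket\, s & \text{if } \llbracket b \rrbracket\, s = \mathtt{false} \\
\llbracket \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done} \rrbracket\, s & = & \lfloor s \rfloor & \text{if } \llbracket b \rrbracket\, s = \mathtt{false} \\
\llbracket \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done} \rrbracket\, s & = & \llbracket c \rrbracket\, s \rhd (\lambda s'.\ \llbracket \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done} \rrbracket\, s') & \text{if } \llbracket b \rrbracket\, s = \mathtt{true}
\end{array}
\]

Moreover, $\llbracket \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done} \rrbracket$ is the smallest function from states to results that
satisfies the last two equations.

Using these properties of $\llbracket c \rrbracket$, we can show full equivalence between the
denotational and natural semantics.

Theorem 12 [denot_exec] [exec_denot] $c, s \Rightarrow s'$ if and only if $\llbracket c \rrbracket\, s = \lfloor s' \rfloor$.

Theorem 13 [denot_execinf] [execinf_denot] $c, s \Rightarrow \infty$ if and only if $\llbracket c \rrbracket\, s = \bot$.

2.7. Further reading

The material presented in this section is inspired by Nipkow [44] (in Isabelle/HOL,
for the IMP language) and by Grall and Leroy [32] (in Coq, for the call-by-value
λ-calculus).

We followed Plotkin's "SOS" presentation [49] of reduction semantics,
characterized by structural inductive rules such as [red_seq_left]. An alternate
presentation, based on reduction contexts, was introduced by Wright and Felleisen [54]
and is very popular to reason about type systems [48].

Definitions and proofs by coinduction can be formalized in two ways: as
greatest fixpoints in a set-theoretic presentation [1] or as infinite derivation trees in
proof theory [13, chap. 13]. Grall and Leroy [32] connect the two approaches.

The definitional interpreter approach was identified by Reynolds in 1972. See
[50] for a historical perspective.

The presentation of denotational semantics we followed avoids the complexity
of Scott domains. Mechanizations of domain theory with applications to
denotational semantics include Agerholm [2] (in HOL), Paulin [46] (in Coq) and Benton
et al. [9] (in Coq).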



3. Axiomatic semantics and program verification 



Operational semantics as in section 2 focuses on describing actual executions of
programs. In contrast, axiomatic semantics (also called Hoare logic) focuses on
verifying logical assertions that relate the values of program variables at various
program points. It is the most popular approach to proving the correctness of
imperative programs.

3.1. Weak Hoare triples and their rules 

Following Hoare's seminal work [24], we consider logical formulas of the form 
{ P } c { Q }, meaning "if precondition P holds, the command c does not go wrong, 
and if it terminates, the postcondition Q holds". Here, P and Q are arbitrary 
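predicates over states. A formula {P} c {Q} is called a weak Hoare triple (as
opposed to the strong Hoare triples discussed in section 3.4, which guarantee
termination as well). We first define some useful operations over predicates:

\[
\begin{array}{ll}
P[x \leftarrow e] \stackrel{\text{def}}{=} \lambda s.\ P(s[x \leftarrow \llbracket e \rrbracket\, s]) &
P \land Q \stackrel{\text{def}}{=} \lambda s.\ P(s) \land Q(s) \\
b\ \mathrm{true} \stackrel{\text{def}}{=} \lambda s.\ \llbracket b \rrbracket\, s = \mathtt{true} &
P \lor Q \stackrel{\text{def}}{=} \lambda s.\ P(s) \lor Q(s) \\
b\ \mathrm{false} \stackrel{\text{def}}{=} \lambda s.\ \llbracket b \rrbracket\, s = \mathtt{false} &
P \Longrightarrow Q \stackrel{\text{def}}{=} \forall s,\ P(s) \Rightarrow Q(s)
\end{array}
\]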

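The axiomatic semantics, that is, the set of legal triples {P} c {Q}, is defined
by the following inference rules: [triple]

\[ \{P\}\ \mathtt{skip}\ \{P\} \quad \text{(triple\_skip)} \qquad\qquad \{P[x \leftarrow e]\}\ x := e\ \{P\} \quad \text{(triple\_assign)} \]

\[ \frac{\{P\}\ c_1\ \{Q\} \qquad \{Q\}\ c_2\ \{R\}}{\{P\}\ c_1; c_2\ \{R\}} \quad \text{(triple\_seq)} \]

\[ \frac{\{b\ \mathrm{true} \land P\}\ c_1\ \{Q\} \qquad \{b\ \mathrm{false} \land P\}\ c_2\ \{Q\}}{\{P\}\ \mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2\ \{Q\}} \quad \text{(triple\_if)} \]

\[ \frac{\{b\ \mathrm{true} \land P\}\ c\ \{P\}}{\{P\}\ \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}\ \{b\ \mathrm{false} \land P\}} \quad \text{(triple\_while)} \]

\[ \frac{P \Longrightarrow P' \qquad \{P'\}\ c\ \{Q'\} \qquad Q' \Longrightarrow Q}{\{P\}\ c\ \{Q\}} \quad \text{(triple\_consequence)} \]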



Example. The triple {a = bq + r} r := r - b; q := q + 1 {a = bq + r} is derivable
from rules triple_assign, triple_seq and triple_consequence because the
following logical equivalences hold:

\[
\begin{array}{rcl}
(a = bq + r)[q \leftarrow q + 1] & \Longleftrightarrow & a = b(q + 1) + r \\
(a = b(q + 1) + r)[r \leftarrow r - b] & \Longleftrightarrow & a = b(q + 1) + (r - b) = bq + r
\end{array}
\]



3.2. Soundness of the axiomatic semantics 



Intuitively, a weak Hoare triple {P} c {Q} is valid if for all initial states s such
that P s holds, either (c, s) diverges or it terminates in a state s' such that Q s'
holds. We capture the latter condition by the predicate (c, s) finally Q, defined
coinductively as: [finally]

\[
\begin{array}{c} Q(s) \\ \hline\hline (\mathtt{skip}, s)\ \mathrm{finally}\ Q \end{array}
\quad \text{(finally\_done)}
\qquad
\begin{array}{c} (c, s) \rightarrow (c', s') \qquad (c', s')\ \mathrm{finally}\ Q \\ \hline\hline (c, s)\ \mathrm{finally}\ Q \end{array}
\quad \text{(finally\_step)}
\]

In an inductive interpretation, rule finally_step could only be applied
a finite number of times, and therefore (c, s) finally Q would be equivalent
to $\exists s',\ (c, s) \xrightarrow{*} (\mathtt{skip}, s') \land Q(s')$. In the coinductive interpretation, rule
finally_step can also be applied infinitely many times, capturing diverging
executions as well.

The semantic interpretation $\llbracket \{P\}\ c\ \{Q\} \rrbracket$ of a triple is, then, the proposition

\[ \forall s,\ P(s) \Longrightarrow (c, s)\ \mathrm{finally}\ Q \qquad \text{[sem\_triple]} \]

We now proceed to show that if {P} c {Q} is derivable, the proposition
$\llbracket \{P\}\ c\ \{Q\} \rrbracket$ above holds. We start by some lemmas about the finally
predicate.

Lemma 14 [finally_seq] If $(c_1, s)\ \mathrm{finally}\ Q$ and $\llbracket \{Q\}\ c_2\ \{R\} \rrbracket$, then
$((c_1; c_2), s)\ \mathrm{finally}\ R$.

Lemma 15 [finally_while] If $\llbracket \{b\ \mathrm{true} \land P\}\ c\ \{P\} \rrbracket$, then
$\llbracket \{P\}\ \mathtt{while}\ b\ \mathtt{do}\ c\ \mathtt{done}\ \{b\ \mathrm{false} \land P\} \rrbracket$.

Lemma 16 [finally_consequence] If $(c, s)\ \mathrm{finally}\ Q$ and $Q \Longrightarrow Q'$, then
$(c, s)\ \mathrm{finally}\ Q'$.

We can then prove the expected soundness result by a straightforward
induction on a derivation of {P} c {Q}:

Theorem 17 [triple_correct] If {P} c {Q} can be derived by the rules of axiomatic
semantics, then $\llbracket \{P\}\ c\ \{Q\} \rrbracket$ holds.

3.3. Generation of verification conditions

In this section, we enrich the syntax of IMP commands with an annotation on 
while loops (to give the loop invariant) and an assert(P) command to let the 
user provide assertions, [acmd] 



Annotated commands:

  c ::= while b do {P} c done     loop with invariant
      | assert(P)                 explicit assertion
      | ...                       other commands as in IMP

Annotated commands can be viewed as regular commands by erasing the {P}
annotation on loops and turning assert(P) to skip. [erase]

The wp function computes the weakest (liberal) precondition for c given a 
postcondition Q. [wp] 

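\[
\begin{array}{lcl}
\mathrm{wp}(\mathtt{skip}, Q) & = & Q \\
\mathrm{wp}(x := e, Q) & = & Q[x \leftarrow e] \\
\mathrm{wp}((c_1; c_2), Q) & = & \mathrm{wp}(c_1, \mathrm{wp}(c_2, Q)) \\
\mathrm{wp}((\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2), Q) & = & (b\ \mathrm{true} \land \mathrm{wp}(c_1, Q)) \lor (b\ \mathrm{false} \land \mathrm{wp}(c_2, Q)) \\
\mathrm{wp}((\mathtt{while}\ b\ \mathtt{do}\ \{P\}\ c\ \mathtt{done}), Q) & = & P \\
\mathrm{wp}(\mathtt{assert}(P), Q) & = & P
\end{array}
\]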

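With the same arguments, the vcg function (verification condition
generator) computes a conjunction of implications that must hold for the triple
{wp(c, Q)} c {Q} to hold. [vcg]

\[
\begin{array}{lcl}
\mathrm{vcg}(\mathtt{skip}, Q) & = & \top \\
\mathrm{vcg}(x := e, Q) & = & \top \\
\mathrm{vcg}((c_1; c_2), Q) & = & \mathrm{vcg}(c_1, \mathrm{wp}(c_2, Q)) \land \mathrm{vcg}(c_2, Q) \\
\mathrm{vcg}((\mathtt{if}\ b\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2), Q) & = & \mathrm{vcg}(c_1, Q) \land \mathrm{vcg}(c_2, Q) \\
\mathrm{vcg}((\mathtt{while}\ b\ \mathtt{do}\ \{P\}\ c\ \mathtt{done}), Q) & = & \mathrm{vcg}(c, P) \\
 & & \land\ (b\ \mathrm{false} \land P \Longrightarrow Q) \\
 & & \land\ (b\ \mathrm{true} \land P \Longrightarrow \mathrm{wp}(c, P)) \\
\mathrm{vcg}(\mathtt{assert}(P), Q) & = & P \Longrightarrow Q
\end{array}
\]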

Lemma 18 [vcg_correct] If vcg(c, Q) holds, then {wp(c, Q)} c {Q} can be derived
by the rules of axiomatic semantics.

The derivation of a Hoare triple {P} c {Q} can therefore be reduced to the
computation of the following vcgen(P, c, Q) logical formula, and its proof. [vcgen]

\[ \mathrm{vcgen}(P, c, Q) \stackrel{\text{def}}{=} (P \Longrightarrow \mathrm{wp}(c, Q)) \land \mathrm{vcg}(c, Q) \]

Theorem 19 [vcgen_correct] If vcgen(P, c, Q) holds, then {P} c {Q} can be derived
by the rules of axiomatic semantics.
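The wp, vcg and vcgen equations can be sketched in Coq as follows, with assertions represented as predicates of type state -> Prop. Everything apart from the shape of the equations is an assumption made for this illustration: the constructor names of acmd, the helper names aupdate, atrue, afalse, aand, aor and aimp, and the definition of update. The closing example discharges the verification condition of a one-assignment program with lia.

```coq
Require Import ZArith Lia.
Open Scope Z_scope.

Definition ident := nat.
Definition state := ident -> Z.

Inductive expr : Type :=
  | Evar : ident -> expr
  | Econst : Z -> expr
  | Eadd : expr -> expr -> expr
  | Esub : expr -> expr -> expr.

Inductive bool_expr : Type :=
  | Bequal : expr -> expr -> bool_expr
  | Bless : expr -> expr -> bool_expr.

Fixpoint eval_expr (s: state) (e: expr) {struct e} : Z :=
  match e with
  | Evar x => s x
  | Econst n => n
  | Eadd e1 e2 => eval_expr s e1 + eval_expr s e2
  | Esub e1 e2 => eval_expr s e1 - eval_expr s e2
  end.

Definition eval_bool_expr (s: state) (b: bool_expr) : bool :=
  match b with
  | Bequal e1 e2 => Z.eqb (eval_expr s e1) (eval_expr s e2)
  | Bless e1 e2 => Z.ltb (eval_expr s e1) (eval_expr s e2)
  end.

Definition update (s: state) (x: ident) (v: Z) : state :=
  fun y => if Nat.eqb y x then v else s y.

Definition assertion := state -> Prop.

(* Annotated commands: loops carry an invariant, plus assert(P). *)
Inductive acmd : Type :=
  | Askip : acmd
  | Aassign : ident -> expr -> acmd
  | Aseq : acmd -> acmd -> acmd
  | Aifthenelse : bool_expr -> acmd -> acmd -> acmd
  | Awhile : bool_expr -> assertion -> acmd -> acmd
  | Aassert : assertion -> acmd.

(* Operations over predicates, as in section 3.1. *)
Definition aupdate (x: ident) (e: expr) (P: assertion) : assertion :=
  fun s => P (update s x (eval_expr s e)).
Definition atrue (b: bool_expr) : assertion :=
  fun s => eval_bool_expr s b = true.
Definition afalse (b: bool_expr) : assertion :=
  fun s => eval_bool_expr s b = false.
Definition aand (P Q: assertion) : assertion := fun s => P s /\ Q s.
Definition aor (P Q: assertion) : assertion := fun s => P s \/ Q s.
Definition aimp (P Q: assertion) : Prop := forall s, P s -> Q s.

Fixpoint wp (c: acmd) (Q: assertion) {struct c} : assertion :=
  match c with
  | Askip => Q
  | Aassign x e => aupdate x e Q
  | Aseq c1 c2 => wp c1 (wp c2 Q)
  | Aifthenelse b c1 c2 =>
      aor (aand (atrue b) (wp c1 Q)) (aand (afalse b) (wp c2 Q))
  | Awhile _ P _ => P
  | Aassert P => P
  end.

Fixpoint vcg (c: acmd) (Q: assertion) {struct c} : Prop :=
  match c with
  | Askip => True
  | Aassign _ _ => True
  | Aseq c1 c2 => vcg c1 (wp c2 Q) /\ vcg c2 Q
  | Aifthenelse _ c1 c2 => vcg c1 Q /\ vcg c2 Q
  | Awhile b P c1 =>
      vcg c1 P /\ aimp (aand (afalse b) P) Q
               /\ aimp (aand (atrue b) P) (wp c1 P)
  | Aassert P => aimp P Q
  end.

Definition vcgen (P: assertion) (c: acmd) (Q: assertion) : Prop :=
  aimp P (wp c Q) /\ vcg c Q.

(* {x >= 0} x := x + 1 {x > 0}, with x encoded as identifier O. *)
Example vcgen_incr:
  vcgen (fun s => s O >= 0)
        (Aassign O (Eadd (Evar O) (Econst 1)))
        (fun s => s O > 0).
Proof.
  split.
  - intros s H. simpl in H. simpl.
    unfold aupdate, update. simpl. lia.
  - exact I.
Qed.
```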



Example. Consider the following annotated IMP program c:

r := a; q := 0;
while b < r+1 do {I} r := r - b; q := q + 1 done

and the following precondition P, loop invariant I and postcondition Q:

P  =def  λs. s(a) ≥ 0 ∧ s(b) > 0
I  =def  λs. s(r) ≥ 0 ∧ s(b) > 0 ∧ s(a) = s(b) × s(q) + s(r)
Q  =def  λs. s(q) = s(a)/s(b)

To prove that {P} c {Q}, we apply theorem 19, then ask Coq to compute and
simplify the formula vcgen(P, c, Q). We obtain the conjunction of three
implications:

s(a) ≥ 0 ∧ s(b) > 0 ⟹ s(a) ≥ 0 ∧ s(b) > 0 ∧ s(a) = s(b) × 0 + s(a)

¬(s(b) < s(r) + 1) ∧ s(r) ≥ 0 ∧ s(b) > 0 ∧ s(a) = s(b) × s(q) + s(r)
    ⟹ s(q) = s(a)/s(b)

s(b) < s(r) + 1 ∧ s(r) ≥ 0 ∧ s(b) > 0 ∧ s(a) = s(b) × s(q) + s(r)
    ⟹ s(r) − s(b) ≥ 0 ∧ s(b) > 0 ∧ s(a) = s(b) × (s(q) + 1) + (s(r) − s(b))

which are easy to prove by purely arithmetic reasoning. 
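The computation of wp, vcg and vcgen can be imitated outside Coq. The following Python sketch is only an illustration, under several assumptions that are not part of the Coq development: commands are encoded as tuples, assertions are Python predicates over stores (dictionaries), and a verification condition is "checked" by brute-force enumeration of a small grid of stores, which is testing rather than proof. The names `implies`, `conj` and `valid_on_grid` are hypothetical helpers.

```python
from itertools import product

# Hypothetical encoding: commands are tuples; expressions, conditions and
# assertions are Python functions from a store (a dict) to a value.
#   ("skip",)  ("assign", x, e)  ("seq", c1, c2)  ("if", b, c1, c2)
#   ("while", b, invariant, body)  ("assert", p)

def implies(p, q):
    """Pointwise implication; q is evaluated only where p holds."""
    return lambda s: (not p(s)) or q(s)

def conj(p, q):
    return lambda s: p(s) and q(s)

def wp(c, post):
    tag = c[0]
    if tag == "skip":
        return post
    if tag == "assign":
        _, x, e = c
        return lambda s: post({**s, x: e(s)})      # Q[x <- e]
    if tag == "seq":
        return wp(c[1], wp(c[2], post))
    if tag == "if":
        _, b, c1, c2 = c
        w1, w2 = wp(c1, post), wp(c2, post)
        return lambda s: w1(s) if b(s) else w2(s)
    if tag == "while":
        return c[2]                                # the loop invariant
    if tag == "assert":
        return c[1]
    raise ValueError(tag)

def vcg(c, post):
    """Verification conditions: assertions that must hold in every store."""
    tag = c[0]
    if tag in ("skip", "assign"):
        return []
    if tag == "seq":
        return vcg(c[1], wp(c[2], post)) + vcg(c[2], post)
    if tag == "if":
        return vcg(c[2], post) + vcg(c[3], post)
    if tag == "while":
        _, b, inv, body = c
        not_b = lambda s: not b(s)
        return vcg(body, inv) + [implies(conj(not_b, inv), post),
                                 implies(conj(b, inv), wp(body, inv))]
    if tag == "assert":
        return [implies(c[1], post)]
    raise ValueError(tag)

def vcgen(pre, c, post):
    return [implies(pre, wp(c, post))] + vcg(c, post)

# The Euclidean division example, with P, I and Q as in the text.
inv = lambda s: s["r"] >= 0 and s["b"] > 0 and s["a"] == s["b"] * s["q"] + s["r"]
pre = lambda s: s["a"] >= 0 and s["b"] > 0
post = lambda s: s["q"] == s["a"] // s["b"]
body = ("seq", ("assign", "r", lambda s: s["r"] - s["b"]),
               ("assign", "q", lambda s: s["q"] + 1))
division = ("seq", ("assign", "r", lambda s: s["a"]),
            ("seq", ("assign", "q", lambda s: 0),
                    ("while", lambda s: s["b"] < s["r"] + 1, inv, body)))

def valid_on_grid(conditions, names=("a", "b", "q", "r"), values=range(-2, 7)):
    """Test (not prove) each condition on every store of a small grid."""
    stores = [dict(zip(names, vs)) for vs in product(values, repeat=len(names))]
    return all(vc(s) for vc in conditions for s in stores)

conditions = vcgen(pre, division, post)
print(len(conditions), valid_on_grid(conditions))
```

As in the text, vcgen yields three implications for this program. Weakening the invariant (for instance, dropping the equation relating a, b, q and r) makes the second implication fail on the grid, which illustrates why the annotation {I} must be strong enough to entail the postcondition at loop exit.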






3.4. Strong Hoare triples 

The axiomatic semantics we have seen so far enables us to prove partial correctness 
properties of programs, but not their termination. To prove termination as well, 
we need to use strong Hoare triples [P] c [Q], meaning "if precondition P holds,
the command c terminates and moreover the postcondition Q holds".

The rules defining valid strong Hoare triples are similar to those for weak 
triples, with the exception of the while rule, which contains additional require- 
ments that ensure termination of the loop. [Triple] 

[P] skip [P]  (Triple_skip)          [P[x ← e]] x := e [P]  (Triple_assign)

[P] c1 [Q]    [Q] c2 [R]
------------------------  (Triple_seq)
     [P] c1; c2 [R]

[b true ∧ P] c1 [Q]    [b false ∧ P] c2 [Q]
-------------------------------------------  (Triple_if)
       [P] if b then c1 else c2 [Q]

∀v ∈ Z, [b true ∧ e_m = v ∧ P] c [0 ≤ e_m < v ∧ P]
---------------------------------------------------  (Triple_while)
       [P] while b do c done [b false ∧ P]

P ⟹ P'    [P'] c [Q']    Q' ⟹ Q
--------------------------------  (Triple_consequence)
           [P] c [Q]



In the Triple_while rule, e_m stands for an expression whose value should
decrease but remain nonnegative at each iteration. The precondition e_m = v and
the postcondition 0 ≤ e_m < v capture this fact:

e_m = v  =def  λs. ⟦e_m⟧ s = v          0 ≤ e_m < v  =def  λs. 0 ≤ ⟦e_m⟧ s < v

The v variable therefore denotes the value of the measure expression at the
beginning of the loop body. Since it is not statically known in general, rule
Triple_while quantifies universally over every possible v ∈ Z. Conceptually, rule
Triple_while has infinitely many premises, one for each possible value of v. Such 
infinitely branching inference rules cause no difficulty in Coq. 
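To make the premise of Triple_while concrete, here is a small Python sketch that tests it on the loop of the Euclidean division program, taking the value of r as the measure e_m. It is an illustration under stated assumptions only: the loop body is modeled as a Python function on stores (so its termination is taken for granted), the premise is tested on a finite grid of stores rather than proved for all v ∈ Z, and the name `while_premise_holds` is hypothetical.

```python
from itertools import product

def while_premise_holds(cond, inv, measure, body, stores):
    """Test [b true /\\ e_m = v /\\ P] c [0 <= e_m < v /\\ P] on the given stores."""
    for s in stores:
        if cond(s) and inv(s):
            v = measure(s)            # value of e_m at the start of the body
            t = body(s)               # store after one iteration
            if not (0 <= measure(t) < v and inv(t)):
                return False
    return True

# Loop of the division program: while b < r+1 do r := r - b; q := q + 1 done
cond = lambda s: s["b"] < s["r"] + 1
inv = lambda s: s["r"] >= 0 and s["b"] > 0 and s["a"] == s["b"] * s["q"] + s["r"]
body = lambda s: {**s, "r": s["r"] - s["b"], "q": s["q"] + 1}

names = ("a", "b", "q", "r")
stores = [dict(zip(names, vs)) for vs in product(range(-2, 7), repeat=4)]

# r decreases and stays nonnegative; q increases, so it is not a valid measure.
print(while_premise_holds(cond, inv, lambda s: s["r"], body, stores))
print(while_premise_holds(cond, inv, lambda s: s["q"], body, stores))
```

The first check succeeds because the invariant guarantees b > 0 and the loop condition guarantees b ≤ r; the second fails as soon as one store satisfies both the condition and the invariant.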

Note that the Triple_while rule above is not powerful enough to prove 
termination for some loops that occur in practice, for example if the termination 
argument is based on a lexicographic ordering. A more general version of the rule 
could involve an arbitrary well-founded ordering between states. 

The semantic interpretation ⟦[P] c [Q]⟧ of a strong Hoare triple is the
proposition

∀s, P(s) ⟹ ∃s', (c, s ⇒ s') ∧ Q(s')          [sem_Triple]

As previously done for weak triples, we now prove the soundness of the infer- 
ence rules for strong triples with respect to this semantic interpretation. 

Theorem 20 [Triple_correct] If [P] c [Q] can be derived by the rules of axiomatic
semantics, then ⟦[P] c [Q]⟧ holds.

The proof is by an outer induction on a derivation of [ P ] c [Q] followed, 
in the while case, by an inner induction on the value of the associated measure 
expression. 

3.5. Further reading 

The material in this section follows Nipkow [44] (in Isabelle/HOL) and Bertot 
[12] (in Coq), themselves following Gordon [37]. 

Separation logic [45,51] extends axiomatic semantics with a notion of local 
reasoning: assertions carry a domain (in our case, a set of variables; in pointer
programs, a set of store locations) and the logic enforces that nothing outside the 
domain of the triple changes during execution. Examples of mechanized separation 
logics include Marti et al. [35] in Coq, Tuch et al. [53] in Isabelle/HOL, Appel 
and Blazy [5] in Coq, and Myreen and Gordon [43] in HOL4. 

The generation of verification conditions (section 3.3) is an instance of a 
more general technique known as "proof by reflection", which aims at replacing
deduction steps by computations [13, chap. 16]. The derivation of {P} c {Q}
from the rules of section 3.1 (a nonobvious process involving nondeterministic
proof search) is replaced by the computation of vcgen(P, c, Q) (a trivial evaluation 
of a recursive function application). Proofs by reflection can tremendously speed 
up the verification of combinatorial properties, as illustrated by Gonthier and 
Werner's mechanized proof of the 4-color theorem [22]. 



4. Compilation to a virtual machine 



There are several ways to execute programs: 

• Interpretation: a program (the interpreter) traverses the abstract syntax 
tree of the program to be executed, performing the intended computations 
on the fly. 

• Compilation to native code: before execution, the program is translated to 
a sequence of machine instructions. These instructions are those of a real 
microprocessor and are executed in hardware. 

• Compilation to virtual machine code: before execution, the program is 
translated to a sequence of instructions. These instructions are those of a
virtual machine. They do not correspond to those of an existing hardware
processor, but are chosen close to the basic operations of the source lan- 
guage. Then, the virtual machine code is either interpreted (more efficiently 
than source-level interpretation) or further translated to real machine code. 

In this section, we study the compilation of the IMP language to an appropriate 
virtual machine. 

4.1. The IMP virtual machine

A state of the machine is composed of: [machine_state] 

• A fixed code C (a list of instructions) . 

• A variable program counter pc (an integer position in C). 

• A variable stack σ (a list of integers).

• A store s (mapping variables to integers). 

The instruction set is as follows: [instruction] [code]

i ::= const(n)      push n on stack
    | var(x)        push value of x
    | setvar(x)     pop value and assign it to x
    | add           pop two values, push their sum
    | sub           pop two values, push their difference
    | branch(δ)     unconditional jump
    | bne(δ)        pop two values, jump if ≠
    | bge(δ)        pop two values, jump if ≥
    | halt          end of program

In branch instructions, δ is an offset relative to the next instruction.

The dynamic semantics of the machine is given by the following one-step
transition relation [transition]. C(pc) is the instruction at position pc in C, if any.

C ⊢ (pc, σ, s) → (pc + 1, n.σ, s)                  if C(pc) = const(n)
C ⊢ (pc, σ, s) → (pc + 1, s(x).σ, s)               if C(pc) = var(x)
C ⊢ (pc, n.σ, s) → (pc + 1, σ, s[x ← n])           if C(pc) = setvar(x)
C ⊢ (pc, n2.n1.σ, s) → (pc + 1, (n1 + n2).σ, s)    if C(pc) = add
C ⊢ (pc, n2.n1.σ, s) → (pc + 1, (n1 − n2).σ, s)    if C(pc) = sub
C ⊢ (pc, σ, s) → (pc + 1 + δ, σ, s)                if C(pc) = branch(δ)
C ⊢ (pc, n2.n1.σ, s) → (pc + 1 + δ, σ, s)          if C(pc) = bne(δ) and n1 ≠ n2
C ⊢ (pc, n2.n1.σ, s) → (pc + 1, σ, s)              if C(pc) = bne(δ) and n1 = n2
C ⊢ (pc, n2.n1.σ, s) → (pc + 1 + δ, σ, s)          if C(pc) = bge(δ) and n1 ≥ n2
C ⊢ (pc, n2.n1.σ, s) → (pc + 1, σ, s)              if C(pc) = bge(δ) and n1 < n2



As in section 2.2, the observable behavior of a machine program is defined by 
sequences of transitions: 

• Termination: C ⊢ (pc, σ, s) ⇓ s' if
C ⊢ (pc, σ, s) →* (pc', σ', s') and C(pc') = halt.

• Divergence: C ⊢ (pc, σ, s) ⇑ if the machine makes infinitely many transitions
from (pc, σ, s).

• Going wrong, otherwise. 

Example. The table below depicts the first 4 transitions of the execution of the
code var(x); const(1); add; setvar(x); branch(−5).

p.c.     stack      store
0        ε          x ↦ 12
1        12.ε       x ↦ 12
2        1.12.ε     x ↦ 12
3        13.ε       x ↦ 12
4        ε          x ↦ 13

code: var(x); const(1); add; setvar(x); branch(−5)



The fifth transition executes the branch(−5) instruction, setting the program
counter back to 0. The overall effect is that of an infinite loop that increments x 
by 1 at each iteration. 
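The transition relation can be read as a partial function from machine states to machine states. The Python sketch below transcribes it under a few assumptions of ours: instructions are tuples such as ("const", 1) or ("add",), stacks are tuples with the top element first, stores are dictionaries in which unset variables read as 0, and the function `run` uses a step budget (`fuel`) because divergence cannot be observed by running the machine.

```python
def step(code, state):
    """One machine transition; None if no transition applies."""
    pc, stack, store = state
    if not 0 <= pc < len(code):
        return None
    op, *args = code[pc]
    if op == "const":
        return (pc + 1, (args[0],) + stack, store)
    if op == "var":
        return (pc + 1, (store.get(args[0], 0),) + stack, store)
    if op == "setvar" and len(stack) >= 1:
        return (pc + 1, stack[1:], {**store, args[0]: stack[0]})
    if op == "branch":
        return (pc + 1 + args[0], stack, store)
    if op in ("add", "sub", "bne", "bge") and len(stack) >= 2:
        n2, n1, rest = stack[0], stack[1], stack[2:]
        if op == "add":
            return (pc + 1, (n1 + n2,) + rest, store)
        if op == "sub":
            return (pc + 1, (n1 - n2,) + rest, store)
        taken = (n1 != n2) if op == "bne" else (n1 >= n2)
        return (pc + 1 + args[0] if taken else pc + 1, rest, store)
    return None                       # halt, or a stuck state

def run(code, state, fuel=1000):
    """Classify the behavior observed within `fuel` transitions."""
    for _ in range(fuel):
        pc = state[0]
        if 0 <= pc < len(code) and code[pc] == ("halt",):
            return ("terminates", state[2])
        nxt = step(code, state)
        if nxt is None:
            return ("goes wrong", state)
        state = nxt
    return ("no verdict", state)      # possibly diverging

# The example from the text: an infinite loop incrementing x.
code = [("var", "x"), ("const", 1), ("add",), ("setvar", "x"), ("branch", -5)]
state = (0, (), {"x": 12})
trace = [state]
for _ in range(5):
    state = step(code, state)
    trace.append(state)
for pc, stack, store in trace:
    print(pc, stack, store)
```

The first five printed states correspond to the columns of the table above, and the sixth shows the program counter back at 0 with x ↦ 13.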



4.2. The compilation scheme

The code comp(e) for an expression evaluates e and pushes its value on top of
the stack [compile_expr]. It executes linearly (no branches) and leaves the store
unchanged. (This is the familiar translation from algebraic notation to reverse
Polish notation.)

comp(x) = var(x)
comp(n) = const(n)
comp(e1 + e2) = comp(e1); comp(e2); add
comp(e1 − e2) = comp(e1); comp(e2); sub



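   code for e1                  code for e1
   code for e2                  code for e2
   bne(•)                       bne(•)
   code for c1                  code for c
   branch(•)                    branch(•)
   code for c2

(Left: bne(•) jumps to the code for c2 and branch(•) jumps past it. Right:
bne(•) jumps past the final branch(•), which jumps back to the code for e1.)

Figure 2. Shape of generated code for if e1 = e2 then c1 else c2 (left) and
while e1 = e2 do c done (right)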

The code comp(b, δ) for a boolean expression falls through if b is true, and branches
to offset δ if b is false. [compile_bool_expr]

comp(e1 = e2, δ) = comp(e1); comp(e2); bne(δ)
comp(e1 < e2, δ) = comp(e1); comp(e2); bge(δ)

The code comp(c) for a command c updates the state according to the semantics
of c, while leaving the stack unchanged. [compile_cmd]

comp(skip) = ε
comp(x := e) = comp(e); setvar(x)
comp(c1; c2) = comp(c1); comp(c2)
comp(if b then c1 else c2) = comp(b, |C1| + 1); C1; branch(|C2|); C2
    where C1 = comp(c1) and C2 = comp(c2)
comp(while b do c done) = B; C; branch(−(|B| + |C| + 1))
    where C = comp(c) and B = comp(b, |C| + 1)

|C| is the length of a list of instructions C. The mysterious offsets in branch
instructions are depicted in figure 2.

Finally, the compilation of a program c is compile(c) = comp(c); halt.
[compile_program]

Combining the compilation scheme with the semantics of the virtual machine, 
we obtain a new way to execute a program c in initial state s: start the machine 
in code comp(c) and state (0, ε, s) (program counter at first instruction of comp(c);
empty stack; state s), and observe its behavior. Does this behavior agree with the 
behavior of c predicted by the semantics of section 2? 
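The compilation scheme is a straightforward structural recursion, and the branch offsets are easier to trust once they are computed on an example. The following Python sketch transcribes the equations for comp and compile; the tuple encoding of expressions, conditions, commands and instructions is an assumption of this sketch, as are the function names.

```python
def comp_expr(e):
    """Expressions: ("var", x), ("const", n), ("add", e1, e2), ("sub", e1, e2)."""
    tag = e[0]
    if tag in ("var", "const"):
        return [(tag, e[1])]
    return comp_expr(e[1]) + comp_expr(e[2]) + [(tag,)]

def comp_bool(b, delta):
    """Conditions: ("eq", e1, e2) or ("lt", e1, e2); branch by delta if false."""
    tag, e1, e2 = b
    branch = {"eq": "bne", "lt": "bge"}[tag]
    return comp_expr(e1) + comp_expr(e2) + [(branch, delta)]

def comp_cmd(c):
    tag = c[0]
    if tag == "skip":
        return []
    if tag == "assign":                       # ("assign", x, e)
        return comp_expr(c[2]) + [("setvar", c[1])]
    if tag == "seq":                          # ("seq", c1, c2)
        return comp_cmd(c[1]) + comp_cmd(c[2])
    if tag == "if":                           # ("if", b, c1, c2)
        code1, code2 = comp_cmd(c[2]), comp_cmd(c[3])
        return (comp_bool(c[1], len(code1) + 1) + code1
                + [("branch", len(code2))] + code2)
    if tag == "while":                        # ("while", b, body)
        body = comp_cmd(c[2])
        test = comp_bool(c[1], len(body) + 1)
        return test + body + [("branch", -(len(test) + len(body) + 1))]
    raise ValueError(tag)

def compile_program(c):
    return comp_cmd(c) + [("halt",)]

# while x < 3 do x := x + 1 done
loop = ("while", ("lt", ("var", "x"), ("const", 3)),
        ("assign", "x", ("add", ("var", "x"), ("const", 1))))
for position, instruction in enumerate(compile_program(loop)):
    print(position, instruction)
```

In the generated code, bge(5) sits at position 2 and therefore jumps to position 2 + 1 + 5 = 8, the final halt; branch(−8) sits at position 7 and jumps to 7 + 1 − 8 = 0, the start of the loop test.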

4.3. Notions of semantic preservation

Consider two programs P1 and P2, possibly in different languages. (For example,
P1 is an IMP command and P2 a sequence of VM instructions.) Under which
conditions can we say that P2 preserves the semantics of P1?



To make this question precise, we assume given operational semantics for the
two languages that associate to P1, P2 sets B(P1), B(P2) of observable behaviors.
In our case, observable behaviors are: termination on a final state s, divergence,
and "going wrong". The set B(P) contains exactly one element if P has
deterministic semantics, two or more otherwise.

Here are several possible formal characterizations of the informal claim that
P2 preserves the semantics of P1.

• Bisimulation (equivalence): B(P1) = B(P2)

• Backward simulation (refinement): B(P1) ⊇ B(P2)

• Backward simulation for correct source programs: if wrong ∉ B(P1) then
B(P1) ⊇ B(P2)

• Forward simulation: B(P1) ⊆ B(P2)

• Forward simulation for correct source programs: if wrong ∉ B(P1) then
B(P1) ⊆ B(P2)

Bisimulation is the strongest notion of semantic preservation, ensuring that 
the two programs are indistinguishable. It is often too strong in practice. For 
example, the C language has non-deterministic semantics because the evaluation 
order for expressions is not fully specified; yet, C compilers choose one particu- 
lar evaluation order while generating deterministic machine code; therefore, the 
generated code has fewer behaviors than the source code. This intuition
corresponds to the backward simulation property defined above: all behaviors of P2
are possible behaviors of P1, but P1 can have more behaviors.

In addition to reducing nondeterminism, compilers routinely optimize away
"going wrong" behaviors. For instance, the source program P1 contains an integer
division z := x/y that can go wrong if y = 0, but the compiler eliminates this
division because z is not used afterwards, thereby generating a program P2 that
does not go wrong if y = 0. This additional degree of liberty is reflected in the
"backward simulation for correct source programs" above.

Finally, the two "forward simulation" properties reverse the roles of P1 and
P2, expressing the fact that any (non-wrong) behavior of the source program P1
is a possible behavior of the compiled code P2. Such forward simulation
properties are generally much easier to prove than backward simulations, but provide
apparently weaker guarantees: P2 could have additional behaviors, not exhibited
by P1, that are undesirable, such as "going wrong". This cannot happen, however,
if P2 has deterministic semantics.

Lemma 21 (Simulation and determinism.) If P2 has deterministic semantics, then
"forward simulation for correct programs" implies "backward simulation for
correct programs".

In conclusion, for deterministic languages such as IMP and IMP virtual ma- 
chine code, "forward simulation for correct programs" is an appropriate notion of 
semantic preservation to prove the correctness of compilers and program trans- 
formations. 
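Since the five notions of preservation are plain relations between sets of behaviors, they can be written down directly. The Python sketch below does so and then tests Lemma 21 exhaustively on a tiny universe of behaviors. The behavior labels and function names are illustrative assumptions, and we assume (as the text implicitly does) that every program has at least one behavior, with "deterministic" modeled as "exactly one behavior".

```python
from itertools import combinations

WRONG = "wrong"

def bisimulation(b1, b2):
    return b1 == b2

def backward(b1, b2):
    return b1 >= b2                      # B(P1) contains B(P2)

def backward_correct(b1, b2):
    return WRONG in b1 or b1 >= b2

def forward(b1, b2):
    return b1 <= b2                      # B(P1) is contained in B(P2)

def forward_correct(b1, b2):
    return WRONG in b1 or b1 <= b2

# Nondeterministic source, deterministic target (the evaluation-order example).
source = frozenset({"terminates with x=1", "terminates with x=2"})
target = frozenset({"terminates with x=1"})
print(bisimulation(source, target), backward(source, target), forward(source, target))

# A "going wrong" source behavior optimized away (the division example).
print(backward(frozenset({WRONG}), frozenset({"terminates"})),
      backward_correct(frozenset({WRONG}), frozenset({"terminates"})))

def nonempty_subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k)]

# Lemma 21, tested on every pair (B(P1), B(P2)) with B(P2) a singleton.
universe = {"terminates with x=1", "terminates with x=2", "diverges", WRONG}
lemma21 = all(backward_correct(b1, b2)
              for b1 in nonempty_subsets(universe)
              for b2 in nonempty_subsets(universe)
              if len(b2) == 1 and forward_correct(b1, b2))
print(lemma21)
```

The exhaustive test mirrors the argument behind the lemma: if wrong ∉ B(P1) and B(P1) ⊆ B(P2) with B(P2) a singleton, then the nonempty set B(P1) must equal B(P2).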



4.4. Semantic preservation for the compiler

Recall the informal specification for the code comp(e) generated by the
compilation of expression e: it should evaluate e and push its value on top of the
stack, execute linearly (no branches), and leave the store unchanged. Formally, we
should have comp(e) ⊢ (0, σ, s) →* (|comp(e)|, (⟦e⟧ s).σ, s) for all stacks σ and
stores s. Note that pc = |comp(e)| means that the program counter is one past the
last instruction in the sequence comp(e). To enable a proof by induction, we need
to strengthen this result and consider codes of the form C1; comp(e); C2, where the
code for e is bracketed by two arbitrary code sequences C1 and C2. The program
counter, then, should go from |C1| (pointing to the first instruction of comp(e)) to
|C1| + |comp(e)| (pointing one past the last instruction of comp(e), or equivalently
to the first instruction of C2).

Lemma 22 [compile_expr_correct] For all instruction sequences C1, C2, stacks σ and
states s,

C1; comp(e); C2 ⊢ (|C1|, σ, s) →* (|C1| + |comp(e)|, (⟦e⟧ s).σ, s)

The proof is a simple induction on the structure of e. Here is a representative
case: e = e1 + e2. Write v1 = ⟦e1⟧ s and v2 = ⟦e2⟧ s. The code C
is C1; comp(e1); comp(e2); add; C2. Viewing C as C1; comp(e1); (comp(e2); add; C2),
we can apply the induction hypothesis to e1, obtaining the transitions

(|C1|, σ, s) →* (|C1| + |comp(e1)|, v1.σ, s)

Likewise, viewing C as (C1; comp(e1)); comp(e2); (add; C2), we can apply the
induction hypothesis to e2, obtaining

(|C1; comp(e1)|, v1.σ, s) →* (|C1; comp(e1)| + |comp(e2)|, v2.v1.σ, s)

Combining these two sequences with an add transition, we obtain

(|C1|, σ, s) →* (|C1; comp(e1); comp(e2)| + 1, (v1 + v2).σ, s)

which is the desired result. 
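The statement of Lemma 22 can be tested on concrete instances, which helps in reading the role of the bracketing sequences C1 and C2. The Python sketch below is an illustration only (a test on one instance, not a proof); the tuple encoding of expressions and instructions, the list representation of stacks (top first) and the names `run_steps` and `lemma22_holds` are assumptions of this sketch. Because the code for an expression is branch-free, it executes in exactly |comp(e)| transitions.

```python
def eval_expr(e, store):
    """Denotation of an expression in a store."""
    tag = e[0]
    if tag == "var":
        return store[e[1]]
    if tag == "const":
        return e[1]
    left, right = eval_expr(e[1], store), eval_expr(e[2], store)
    return left + right if tag == "add" else left - right

def comp_expr(e):
    if e[0] in ("var", "const"):
        return [e]                        # var(x) or const(n)
    return comp_expr(e[1]) + comp_expr(e[2]) + [(e[0],)]

def run_steps(code, pc, stack, store, steps):
    """Execute `steps` transitions of the branch-free fragment of the machine."""
    for _ in range(steps):
        op, *args = code[pc]
        if op == "const":
            stack = [args[0]] + stack
        elif op == "var":
            stack = [store[args[0]]] + stack
        elif op in ("add", "sub"):
            n2, n1 = stack[0], stack[1]
            stack = [n1 + n2 if op == "add" else n1 - n2] + stack[2:]
        else:
            raise ValueError(f"unexpected instruction {op}")
        pc += 1
    return pc, stack

def lemma22_holds(e, before, after, stack, store):
    """Start at |C1| in C1; comp(e); C2 and compare with the lemma's final state."""
    body = comp_expr(e)
    code = before + body + after
    pc, final_stack = run_steps(code, len(before), stack, store, len(body))
    return (pc == len(before) + len(body)
            and final_stack == [eval_expr(e, store)] + stack)

# e = (x - 1) + (y - x), bracketed by arbitrary code that is never executed.
e = ("add", ("sub", ("var", "x"), ("const", 1)), ("sub", ("var", "y"), ("var", "x")))
before = [("halt",), ("branch", 7)]
after = [("setvar", "z"), ("halt",)]
print(lemma22_holds(e, before, after, [42], {"x": 5, "y": 7}))
```

The initial stack [42] plays the role of σ: it must reappear unchanged underneath the value of e.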

The statement and proof of correctness for the compilation of boolean ex- 
pressions is similar. Here, the stack and the store are left unchanged, and control 
is transferred either to the end of the generated instruction sequence or to the 
given offset relative to this end, depending on the truth value of the condition. 

Lemma 23 [compile_bool_expr_correct] For all instruction sequences C1, C2, stacks σ
and states s,

C1; comp(b, δ); C2 ⊢ (|C1|, σ, s) →* (pc, σ, s)

with pc = |C1| + |comp(b, δ)| if ⟦b⟧ s = true and pc = |C1| + |comp(b, δ)| + δ otherwise.



To show semantic preservation between an IMP command and its compiled 
code, we prove a "forward simulation for correct programs" result. We therefore 
have two cases to consider: (1) the command terminates normally, and (2) the 
command diverges. In both cases, we use the natural semantics to conduct the 
proof, since its compositional nature is a good match for the compositional nature 
of the compilation scheme. 

Theorem 24 [compile_cmd_correct_terminating] Assume c, s ⇒ s'. Then, for all
instruction sequences C1, C2 and stacks σ,

C1; comp(c); C2 ⊢ (|C1|, σ, s) →* (|C1| + |comp(c)|, σ, s')

The proof is by induction on a derivation of c, s ⇒ s' and uses the same
techniques as that of lemma 22.

For the diverging case, we need the following special-purpose coinduction 
principle. 

Lemma 25 Let X be a set of (machine code, machine state) pairs such that

∀(C, S) ∈ X, ∃S', (C, S') ∈ X ∧ C ⊢ S →+ S'.

Then, for all (C, S) ∈ X, we have C ⊢ S ⇑ (there exists an infinite sequence of
transitions starting from S).

The following theorem follows from the coinduction principle above applied to
the set

X = {(C1; comp(c); C2, (|C1|, σ, s)) | c, s ⇒ ∞}.

Theorem 26 [compile_cmd_correct_diverging] Assume c, s ⇒ ∞. Then, for all
instruction sequences C1, C2 and stacks σ,

C1; comp(c); C2 ⊢ (|C1|, σ, s) ⇑

This completes the proof of forward simulation for correct programs. 

4.5. Further reading

The virtual machine used in this section matches a small subset of the Java Vir- 
tual Machine [34]. Other examples of mechanized verification of nonoptimizing 
compilers producing virtual machine code include Bertot [10] (for the IMP
language), Klein and Nipkow [29] (for a subset of Java), and Grall and Leroy [32]
(for call-by-value λ-calculus). The latter two show forward simulation results;
Bertot shows both forward and backward simulation, and concludes that backward
simulation is considerably more difficult to prove. Other examples of difficult
backward simulation arguments (not mechanized) can be found in [23], for
call-by-name and call-by-value λ-calculus.



Lemma 22 (correctness of compilation of arithmetic expressions to stack machine
code) is historically important: it is the oldest published compiler correctness
proof (McCarthy and Painter [36], in 1967) and the oldest mechanized compiler
correctness proof (Milner and Weyhrauch [39], in 1972). Since then, a great
many correctness proofs for compilers and compilation passes have been pub- 
lished, some of them being mechanized: Dave's bibliography [19] lists 99 references 
up to 2002. 

5. An example of optimizing program transformation: dead code elimination 

Compilers are typically structured as a sequence of program transformations, 
also called passes. Some passes translate from one language to another, lower- 
level language, closer to machine code. The compilation scheme of section 4 is a 
representative example. Other passes are optimizations: they rewrite the program 
to an equivalent, but more efficient program. For example, the optimized program 
runs faster, or is smaller. 

In this section, we study a representative optimization: dead code elimination. 
The purpose of this optimization, performed on the IMP source language, is to 
remove assignments x := e (turning them into skip instructions) such that the
value of x is not used in the remainder of the program. This reduces both the 
execution time and the code size. 

Example. Consider the command x := 1; y := y + 1; x := 2. The assignment
x := 1 can always be eliminated since x is not referenced before being
redefined by x := 2.

To detect the fact that the value of a variable is not used later, we need a 
static analysis known as liveness analysis. 

5.1. Liveness analysis 

A variable is dead at a program point if its value is not used later in the execution 
of the program: either the variable is never mentioned again, or it is always 
redefined before further use. A variable is live if it is not dead. 

Given a set A of variables live "after" a command c, the function live(c, A) 
over-approximates the set of variables live "before" the command [live]. It proceeds 
by a form of reverse execution of c, conservatively assuming that conditional 
branches can go both ways. FV computes the set of variables referenced in an
expression [fv_expr] [fv_bool_expr].

live(skip, A) = A
live(x := e, A) = (A \ {x}) ∪ FV(e)    if x ∈ A
                  A                     if x ∉ A
live((c1; c2), A) = live(c1, live(c2, A))
live((if b then c1 else c2), A) = FV(b) ∪ live(c1, A) ∪ live(c2, A)
live((while b do c done), A) = fix(λX. A ∪ FV(b) ∪ live(c, X))



If F is a function from sets of variables to sets of variables, fix(F) is supposed
to compute a post-fixpoint of F, that is, a set X such that F(X) ⊆ X. Typically,
F is iterated n times, starting from the empty set, until we reach an n such
that F^(n+1)(∅) ⊆ F^n(∅). Ensuring termination of such an iteration is, in general, a
difficult problem. (See section 5.4 for discussion.) To keep things simple, we bound
arbitrarily to N the number of iterations, and return a default over-approximation
if a post-fixpoint cannot be found within N iterations: [fixpoint]

fix(F, default) = F^n(∅)     if ∃n ≤ N, F^(n+1)(∅) ⊆ F^n(∅)
                  default    otherwise

Here, a suitable default is A ∪ FV(while b do c done), the set of variables live
"after" the loop or referenced within the loop.

live((while b do c done), A) = fix(λX. A ∪ FV(b) ∪ live(c, X),
                                   A ∪ FV(while b do c done))
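The bounded iteration behind fix is short enough to write out. The following Python sketch is a direct reading of the definition, with sets of variables represented as frozensets; the parameter name `max_iter` (standing for N) and the example function F are assumptions made for illustration.

```python
def fix(F, default, max_iter=10):
    """Return F^n(empty) for the first n <= max_iter with F^(n+1)(empty) <= F^n(empty),
    or `default` if no such n exists."""
    current = frozenset()                 # F^0(empty)
    for _ in range(max_iter + 1):         # n = 0, 1, ..., max_iter
        nxt = F(current)                  # F^(n+1)(empty)
        if nxt <= current:                # post-fixpoint reached
            return current
        current = nxt
    return default

# Illustrative F: "a" is always included, and each variable pulls in its successor.
successor = {"a": "b", "b": "c", "c": "d"}
F = lambda X: frozenset({"a"}) | frozenset(successor[x] for x in X if x in successor)

everything = frozenset({"a", "b", "c", "d", "e"})
print(sorted(fix(F, everything)))               # converges after 4 iterations
print(sorted(fix(F, everything, max_iter=2)))   # bound too small: falls back to the default
```

With a generous bound, the iteration stabilizes on {a, b, c, d}; with a bound that is too small, the coarser default is returned, which is safe only because the default is itself an over-approximation.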

Lemma 27 [live_while_charact] Let A' = live(while b do c done, A). Then:

FV(b) ⊆ A'      A ⊆ A'      live(c, A') ⊆ A'

5.2. Dead code elimination 

The program transformation that eliminates dead code is, then: [dce]

dce(skip, A) = skip
dce(x := e, A) = x := e    if x ∈ A
                 skip      if x ∉ A
dce((c1; c2), A) = dce(c1, live(c2, A)); dce(c2, A)
dce((if b then c1 else c2), A) = if b then dce(c1, A) else dce(c2, A)
dce(while b do c done, A) = while b do dce(c, live(while b do c done, A)) done

Example. Consider again the "Euclidean division" program c: 

r := a; q := 0; while b < r+1 do r := r - b; q := q + 1 done

If q is not live "after" (q ∉ A), it is not live throughout this program either.
Therefore, dce(c, A) produces

r := a; skip; while b < r+1 do r := r - b; skip done

The useless computations of q have been eliminated entirely, in a process similar
to program slicing. In contrast, if q is live "after" (q ∈ A), all computations are
necessary and dce(c, A) returns c unchanged.
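The example can be replayed with a small executable model of live and dce. The Python sketch below is an illustration under assumptions of ours: the tuple encoding of the syntax, a bound of 10 iterations for the fixpoint, a helper `fv_cmd` collecting the variables read by a command, and an assignment case of live that keeps A unchanged when the assigned variable is dead. In this sketch the loop body is optimized with respect to live(while b do c done, A), the set A' of Lemma 27, since the variables live at the end of the body are those live at the start of the next iteration.

```python
def fv(e):
    """Variables referenced in an arithmetic or boolean expression."""
    if e[0] == "var":
        return frozenset({e[1]})
    if e[0] == "const":
        return frozenset()
    return fv(e[1]) | fv(e[2])               # add, sub, eq, lt

def fv_cmd(c):
    """Variables read by a command (used only for the default)."""
    tag = c[0]
    if tag == "skip":
        return frozenset()
    if tag == "assign":
        return fv(c[2])
    if tag == "seq":
        return fv_cmd(c[1]) | fv_cmd(c[2])
    if tag == "if":
        return fv(c[1]) | fv_cmd(c[2]) | fv_cmd(c[3])
    return fv(c[1]) | fv_cmd(c[2])           # while

def live(c, after, max_iter=10):
    tag = c[0]
    if tag == "skip":
        return after
    if tag == "assign":
        x, e = c[1], c[2]
        return (after - {x}) | fv(e) if x in after else after
    if tag == "seq":
        return live(c[1], live(c[2], after))
    if tag == "if":
        return fv(c[1]) | live(c[2], after) | live(c[3], after)
    cond, body = c[1], c[2]                  # while: bounded fixpoint iteration
    current = frozenset()
    for _ in range(max_iter + 1):
        nxt = after | fv(cond) | live(body, current)
        if nxt <= current:
            return current
        current = nxt
    return after | fv_cmd(c)                 # default over-approximation

def dce(c, after):
    tag = c[0]
    if tag == "skip":
        return c
    if tag == "assign":
        return c if c[1] in after else ("skip",)
    if tag == "seq":
        return ("seq", dce(c[1], live(c[2], after)), dce(c[2], after))
    if tag == "if":
        return ("if", c[1], dce(c[2], after), dce(c[3], after))
    return ("while", c[1], dce(c[2], live(c, after)))

# The Euclidean division program.
var = lambda x: ("var", x)
loop = ("while", ("lt", var("b"), ("add", var("r"), ("const", 1))),
        ("seq", ("assign", "r", ("sub", var("r"), var("b"))),
                ("assign", "q", ("add", var("q"), ("const", 1)))))
division = ("seq", ("assign", "r", var("a")),
            ("seq", ("assign", "q", ("const", 0)), loop))

print(dce(division, frozenset()))                       # q dead: both q assignments become skip
print(dce(division, frozenset({"q"})) == division)      # q live: program unchanged
```

When q ∉ A, the set of variables live around the loop is {b, r}, so both assignments to q are replaced by skip while the assignments to r are kept.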



5.3. Correctness of the transformation 



We show a "forward simulation for correct programs" property: 

• If c, s ⇓ s', then dce(c, A), s ⇓ s'' for some s'' related to s'.

• If c, s ⇑, then dce(c, A), s ⇑.

However, the program dce(c, A) performs fewer assignments than c, therefore 
the final states can differ on the values of dead variables. We define agreement 
between two states s, s' with respect to a set of live variables A. [agree] 

s ≈ s' : A  =def  ∀x ∈ A, s(x) = s'(x)

Lemma 28 [eval_expr_agree] [eval_bool_expr_agree] Assume s ≈ s' : A. If FV(e) ⊆ A,
then ⟦e⟧ s = ⟦e⟧ s'. If FV(b) ⊆ A, then ⟦b⟧ s = ⟦b⟧ s'.

The following two key lemmas show that agreement is preserved by parallel 
assignment to a live variable, or by unilateral assignment to a dead variable. The 
latter case corresponds to the replacement of x := e by skip. 

Lemma 29 [agree_update_live] (Assignment to a live variable.) If s ≈ s' : A \ {x},
then s[x ← v] ≈ s'[x ← v] : A.

Lemma 30 [agree_update_dead] (Assignment to a dead variable.) If s ≈ s' : A and
x ∉ A, then s[x ← v] ≈ s' : A.
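Both lemmas are simple enough to be tested exhaustively on a small universe, which makes their statements easy to internalize. The Python sketch below does this for two variables ranging over {0, 1}; the representation of stores as dictionaries and the names `agree` and `update` are assumptions of this sketch, and the enumeration is a test on a finite model, not a proof.

```python
from itertools import combinations, product

def agree(s1, s2, live_vars):
    """s1 and s2 agree on every variable of live_vars."""
    return all(s1[x] == s2[x] for x in live_vars)

def update(s, x, v):
    return {**s, x: v}

variables = ("x", "y")
values = (0, 1)
stores = [dict(zip(variables, vs)) for vs in product(values, repeat=len(variables))]
var_sets = [set(c) for k in range(len(variables) + 1)
            for c in combinations(variables, k)]

# Lemma 29: parallel assignment to x turns agreement on A \ {x} into agreement on A.
lemma29 = all(agree(update(s1, x, v), update(s2, x, v), A)
              for s1 in stores for s2 in stores for A in var_sets
              for x in variables for v in values
              if agree(s1, s2, A - {x}))

# Lemma 30: unilateral assignment to a dead variable preserves agreement on A.
lemma30 = all(agree(update(s1, x, v), s2, A)
              for s1 in stores for s2 in stores for A in var_sets
              for x in variables for v in values
              if agree(s1, s2, A) and x not in A)

print(lemma29, lemma30)

# The side condition of Lemma 30 matters: a unilateral assignment to a live
# variable can break agreement.
s = {"x": 0, "y": 0}
print(agree(update(s, "x", 1), s, {"x"}))
```

The last line shows why replacing x := e by skip is only justified when x is dead: otherwise the two executions may disagree on a variable that is still observed.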

Using these lemmas, we can show forward simulation diagrams both for
terminating and diverging commands c. In both cases, we assume agreement on the
variables live "before" c, namely live(c, A).

Theorem 31 [dce_correct_terminating] If c, s ⇒ s' and s ≈ s1 : live(c, A), then there
exists s1' such that dce(c, A), s1 ⇒ s1' and s' ≈ s1' : A.

Theorem 32 [dce_correct_diverging] If c, s ⇒ ∞ and s ≈ s1 : live(c, A), then
dce(c, A), s1 ⇒ ∞.

5.4. Further reading

Dozens of compiler optimizations are known, each targeting a particular class of 
inefficiencies. See Appel [3] for an introduction to optimization, and Muchnick 
[42] for a catalog of classic optimizations. 

The results of liveness analysis can be exploited to perform register allocation 
(a crucial optimization performance-wise), following Chaitin's approach [17] [3,
chap. 11]: coloring of an interference graph. A mechanized proof of correctness for 
graph coloring-based register allocation, extending the proof given in this section, 
is described by Leroy [31,30]. 

Liveness analysis is an instance of a more general class of static analyses 
called dataflow analyses [3, chap. 17], themselves being a special case of abstract 
interpretation. Bertot et al. [14] and Leroy [30] prove, in Coq, the correctness of
several optimizations based on dataflow analyses, such as constant propagation
and common subexpression elimination. Cachera et al. [16] present a reusable Coq 
framework for dataflow analyses. 

Dataflow analyses are generally carried on an unstructured representation of 
the program called the control-flow graph. Dataflow equations are set up between 
the nodes of this graph, then solved by one global fixpoint iteration, often based 
on Kildall's worklist algorithm [27]. This is more efficient than the approach we 
described (computing a local fixpoint for each loop), which can be exponential 
in the nesting degree of loops. Kildall's worklist algorithm has been mechanically 
verified many times [14,18,29]. 

The effective computation of fixpoints is a central issue in static analysis. 
Theorems such as Knaster-Tarski's show the existence of fixpoints in many cases, 
and can be mechanized [47,15], but fail to provide effective algorithms. Noetherian 
recursion can be used if the domain of the analysis is well founded (no infinite 
chains) [13, chap. 15], but this property is difficult to ensure in practice [16]. The 
shortcut we took in this section (bounding arbitrarily the number of iterations) 
is inelegant but a reasonable engineering compromise. 

6. State of the art and current trends 

While this lecture was illustrated using "toy" languages and machines, the tech- 
niques we presented, based on operational and axiomatic semantics and on their 
mechanization using proof assistants, do scale to realistic programming languages 
and systems. Here are some recent achievements using similar techniques, in re- 
verse chronological order. 

• The verification of the seL4 secure micro-kernel
(http://nicta.com.au/research/projects/l4.verified/) [28].

• The CompCert verified compiler: a realistic, moderately-optimizing com- 
piler for a large subset of the C language down to PowerPC and ARM 
assembly code (http://compcert.inria.fr/) [31].

• The Verisoft project (http://www.verisoft.de/), which aims at the end- 
to-end formal verification of a complete embedded system, from hardware 
to application. 

• Formal specifications of the Java / Java Card virtual machines and
mechanized verifications of the Java bytecode verifier: Jinja [29], Jakarta [7],
Bicolano (http://mobius.inria.fr/twiki/bin/view/Bicolano), and
the Kestrel Institute project (http://www.kestrel.edu/home/projects/java/).

• Formal verification of the ARM6 processor micro-architecture against the 
ARM instruction set specification [21].

• The "foundational" approach to Proof-Carrying Code [4]. 

• The CLI stack: a formally verified microprocessor and compiler from
an assembly-level language
(http://www.cs.utexas.edu/~moore/best-ideas/piton/index.html) [40].

Here are some active research topics in this area. 



Combining static analysis and program proof. Static analysis can be viewed as 
the automatic generation of logical assertions, enabling the results of static analy- 
sis to be verified a posteriori using a program logic, and facilitating the annotation 
of existing code with logical assertions. 

Proof-preserving compilation. Given a source program annotated with assertions 
and a proof in axiomatic semantics, can we produce machine code annotated with 
the corresponding assertions and the corresponding proof? [8,33]. 

Binders and α-conversion. A major obstacle to the mechanization of rich
language semantics and advanced type systems is the handling of bound variables
and the fact that terms containing binders are equal modulo α-conversion of
bound variables. The POPLmark challenge explores this issue [6].

Shared-memory concurrency. Shared-memory concurrency raises major seman- 
tic difficulties, ranging from formalizing the "weakly-consistent" memory models 
implemented by today's multicore processors [52] to mechanizing program logics 
appropriate for proving concurrent programs correct [20,25]. 

Progressing towards fully-verified development and verification environments for 
high-assurance software. Beyond verifying compilers and other code generation 
tools, we'd like to gain formal assurance in the correctness of program verification 
tools such as static analyzers and program provers. 
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